Re: on load control / process swapping

linux-mm.kvack.org archive mirror

From: Matt Dillon <dillon@earth.backplane.com>
To: Rik van Riel <riel@conectiva.com.br>
Cc: arch@freebsd.org, linux-mm@kvack.org, sfkaplan@cs.amherst.edu
Subject: Re: on load control / process swapping
Date: Sat, 12 May 2001 16:58:14 -0700 (PDT)
Message-ID: <200105122358.f4CNwEr20137@earth.backplane.com>
In-Reply-To: <Pine.LNX.4.21.0105121109210.5468-100000@imladris.rielhome.conectiva>

    Consider the case where you have one large process and many small
    processes.  If you were to skew things to allow the large process to
    run at the cost of all the small processes, you have just inconvenienced
    98% of your users so one ozob can run a big job.  Not only that, but
    there is no guarantee that the 'big job' will ever finish (a topic of
    many a paper on scheduling, BTW)... what if it's been running for hours
    and still has hours to go?  Do we blow away the rest of the system to
    let it run?  

    What if there are several big jobs?  If you skew things in favor of
    one, the others could take 60 seconds *just* to recover their RSS when
    they are finally allowed to run.  So much for timesharing... you
    would have to run each job exclusively for 5-10 minutes at a time
    to get any sort of efficiency, which is not practical in a timeshare
    system.  So there is really very little that you can do.
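
The arithmetic behind the 5-10 minute figure can be sketched roughly as
follows. This is a toy model, not anything from the FreeBSD source: it
assumes a job pays the full RSS recovery cost once at the start of every
slice and does useful work for the remainder. The function name and the
one-recovery-per-slice assumption are illustrative only.

```python
def timeshare_efficiency(slice_seconds, rss_recovery_seconds):
    """Fraction of a time slice left for useful work after the job has
    faulted its resident set back in (assumed: once per slice)."""
    if slice_seconds <= 0:
        raise ValueError("slice_seconds must be positive")
    useful = max(0.0, slice_seconds - rss_recovery_seconds)
    return useful / slice_seconds


# 60 seconds of RSS recovery, as in the example above (assumed constant).
for minutes in (1, 2, 5, 10):
    eff = timeshare_efficiency(minutes * 60, 60)
    print(f"{minutes:2d} min slice: {eff:.0%} useful")
```

Under these assumptions a 1-minute slice is spent entirely on recovery,
while 5- and 10-minute slices leave 80% and 90% for useful work.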

:Indeed, the speed limiting of the pageout scanning takes care of
:this. But still, having the swapout threshold defined as being
:short of inactive pages while the swapin threshold uses the number
:of free+cache pages as an indication could lead to the situation
:where you suspend and wake up processes while it isn't needed.
:
:Or worse, suspending one process which easily fit in memory and
:then waking up another process, which cannot be swapped in because
:the first process' memory is still sitting in RAM and cannot be
:removed yet due to the pageout scan speed limiting (and also cannot
:be used, because we suspended the process).

    We don't suspend running processes, but I do believe FreeBSD is still
    vulnerable to this issue.  Suspending the marked process when it hits the
    vm_fault code is a good idea and would solve the problem.  If the process
    never takes an allocation fault, it probably doesn't have to be swapped
    out.  The normal pageout would suffice for that process.
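
A minimal sketch of the "suspend at fault time" idea, as a toy model rather
than kernel code. The `Proc` class, `mark_for_swapout` and
`on_allocation_fault` are hypothetical names invented for illustration; the
real hook would live in the vm_fault path.

```python
from dataclasses import dataclass


@dataclass
class Proc:
    name: str
    marked_for_swapout: bool = False  # set by a (hypothetical) load controller
    suspended: bool = False


def mark_for_swapout(proc):
    """Load control picks a victim but does not stop it yet."""
    proc.marked_for_swapout = True


def on_allocation_fault(proc):
    """Called when the process faults on a page that needs new memory.

    Returns True if the fault may proceed, False if the process was
    suspended instead.
    """
    if proc.marked_for_swapout:
        proc.suspended = True
        return False
    return True


hog = Proc("hog")
idle = Proc("idle")
mark_for_swapout(hog)
mark_for_swapout(idle)

on_allocation_fault(hog)  # hog asks for more memory and is stopped here
# idle never faults, so it keeps running and is left to the normal pageout
```

The point of deferring the suspension is that a marked process which fits
in memory is never stopped at all.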

:>     The pagein and pageout rates have nothing to do with thrashing, per se,
:>     and should never be arbitrarily limited.
:
:But they are, with the pageout daemon going to sleep for half a
:second if it doesn't succeed in freeing enough memory at once.
:It even does this if a large part of the memory on the active
:list belongs to a process which has just been suspended because
:of thrashing...

    No.  I did say the code was complex.  A process which has been
    suspended for thrashing gets all of its pages depressed in priority.
    The page daemon would have no problem recovering the pages.   See
    line 1458 of vm_pageout.c.  This code also enforces the 'memoryuse'
    resource limit (which is perhaps even more important).  It is not
    necessary to try to launder the pages immediately.  Simply depressing
    their priority is sufficient and it allows for quicker recovery when
    the thrashing goes away.  It also allows us to implement the 
    vm.swap_idle_{threshold1,threshold2,enabled} sysctls trivially, which
    results in proactive swapping that is extremely useful in certain
    situations (like shell machines with lots of idle users).
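
The difference between depressing priority and laundering can be sketched
with a toy two-queue model. Everything here is an assumption made for
illustration: the `Page` class, the integer `activity` score and both
function names are hypothetical and are not taken from vm_pageout.c.

```python
from dataclasses import dataclass


@dataclass
class Page:
    owner: str
    activity: int        # hypothetical per-page activity score
    dirty: bool = False


def depress_pages(active, owner):
    """Drop the activity score of every page owned by a suspended process.
    Nothing is written to swap here."""
    for page in active:
        if page.owner == owner:
            page.activity = 0


def scan_active(active, inactive):
    """One page-daemon pass: pages with no remaining activity move to the
    inactive queue; the rest age by one step and stay active."""
    still_active = []
    for page in active:
        if page.activity == 0:
            inactive.append(page)
        else:
            page.activity -= 1
            still_active.append(page)
    active[:] = still_active


active = [Page("big", 5, dirty=True), Page("big", 3), Page("sh", 4)]
inactive = []
depress_pages(active, "big")   # "big" was just suspended for thrashing
scan_active(active, inactive)  # its pages are recovered on the next pass
```

In this model the suspended process's pages become reclaimable on the next
scan, but a dirty page stays dirty and in memory, so a process that resumes
quickly can get it back without any I/O.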

    The pagedaemon gets behind when there are too many
    active pages in the system and the pagedaemon is unable to move them
    to the inactive queue due to the pages still being very active... that is,
    when the active resident set for all processes in the system exceeds
    available memory.  This is what triggers thrashing.  Swapping has the
    side effect of reducing the total active resident set for the system
    as a whole, fixing the thrashing problem. 
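
The condition described above can be sketched as a simple sum: thrashing
starts when the active resident sets add up to more than available memory,
and swapping a process out removes its share from that sum. This is a toy
model; `pick_swapouts` is a hypothetical name, and the largest-first victim
order is an arbitrary choice for the sketch, not FreeBSD's policy.

```python
def pick_swapouts(active_rss, available_pages):
    """active_rss maps process name -> active resident pages.

    Return the names to swap out so that the remaining active resident
    set fits in available_pages.
    """
    total = sum(active_rss.values())
    victims = []
    by_size = sorted(active_rss.items(), key=lambda kv: kv[1], reverse=True)
    for name, pages in by_size:
        if total <= available_pages:
            break
        victims.append(name)
        total -= pages
    return victims


# 1070 active pages competing for 500: swapping out "big" leaves 370.
load = {"big": 700, "a": 100, "b": 150, "c": 120}
victims = pick_swapouts(load, 500)
```

When the total already fits, the function returns an empty list and nothing
is swapped.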

						-Matt

:>     I don't think it's possible to write a nice neat thrash-handling
:>     algorithm.  It's a bunch of algorithms all working together, all
:>     closely tied to the VM page cache.  Each taken alone is fairly easy
:>     to describe and understand.  All of them together result in complex
:>     interactions that are very easy to break if you make a mistake.
:
:Heheh, certainly true ;)
:
:cheers,
:
:Rik
