Re: on load control / process swapping

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

From: Matt Dillon <dillon@earth.backplane.com>
To: Rik van Riel <riel@conectiva.com.br>
Cc: arch@freebsd.org, linux-mm@kvack.org, sfkaplan@cs.amherst.edu
Subject: Re: on load control / process swapping
Date: Mon, 7 May 2001 15:50:20 -0700 (PDT)	[thread overview]
Message-ID: <200105072250.f47MoKe68863@earth.backplane.com> (raw)
In-Reply-To: <Pine.LNX.4.21.0105061924160.582-100000@imladris.rielhome.conectiva>

    This is accomplished as a side effect to the way the page queues
    are handled.  A page placed in the active queue is not allowed
    to be moved out of that queue for a minimum period of time based
    on page aging.  See line 500 or so of vm_pageout.c (in -stable) .

    Thus when a process wakes up and pages a bunch of pages in, those
    pages are guarenteed to stay in-core for a period of time no matter
    what level of memory stress is occuring.

:2) make sure the resident processes aren't thrashing,
:   that is, don't let new processes back in memory if
:   none of the currently resident processes is "ready"
:   to be suspended

    When a process is swapped out, the process is removed from the run
    queue and the P_INMEM flag is cleared.  The process is only woken up
    when faultin() is called (vm_glue.c line 312).  faultin() is only
    called from the scheduler() (line 340 of vm_glue.c) and the scheduler
    only runs when the VM system indicates a minimum number of free pages
    are available (vm_page_count_min()), which you can adjust with
    the vm.v_free_min sysctl (usually represents 1-9 megabytes, dependings
    on how much memory the system has).

    So what occurs is that the system comes under extreme memory pressure
    and starts to swapout blocked processes.  This reduces memory pressure
    over time.  When memory pressure is sufficiently reudced the scheduler
    wakes up a swapped-out process (one at a time).

    There might be some fine tuning that we can do here, such as try to
    choose a better process to swapout (right now it's priority based which
    isn't the best way to do it).

:3) have a mechanism to detect thrashing in a VM
:   subsystem which isn't rate-limited  (hard?)

    In FreeBSD, rate-limiting is a function of a lightly loaded system.
    We rate-limit page laundering (pageouts).  However, if the rate-limited
    laundering is not sufficient to reach our free + cache page targets,
    we take another laundering loop and this time do not limit it at all.

    Thus under heavy memory pressure, no real rate limiting occurs.  The
    system will happily pagein and pageout megabytes/sec.  The reason we
    do this is because David Greenman and John Dyson found a long time
    ago that attempting to rate limit paging does not actually solve the
    thrashing problem, it actually makes it worse... So they solved the
    problem another way (see my answers for #1 and #2).  It isn't the
    paging operations themselves that cause thrashing.

:and, for extra brownie points:
:4) fairness, small processes can be paged in and out
:   faster, so we can suspend&resume them faster; this
:   has the side effect of leaving the proverbial root
:   shell more usable

    Small process can contribute to thrashing as easily as large
    processes can under extreme memory pressure... for example,
    take an overloaded shell machine.  *ALL* processes are 'small'
    processes in that case, or most of them are, and in great numbers
    they can be the cause.  So no test that specifically checks the
    size of the process can be used to give it any sort of priority.

    Additionally, *idle* small processes are also great contributers 
    to the VM subsystem in regards to clearing out idle pages.  For
    example, on a heavily loaded shell machine more then 80% of the
    'small processes' have been idle for long periods of time and it
    is exactly our ability to page them out that allows us to extend
    the machine's operational life and move the thrashing threshold
    farther away.  The last thing we want to do is make a 'fix' that
    prevents us from paging out idle small processes.  It would kill
    the machine.

:5) make sure already resident processes cannot create
:   a situation that'll keep the swapped out tasks out
:   of memory forever ... but don't kill performance either,
:   since bad performance means we cannot get out of the
:   bad situation we're in

    When the system starts swapping processes out, it continues to swap
    them out until memory pressure goes down.  With memory pressure down
    processes are swapped back in again one at a time, typically in FIFO
    order.  So this situation will generally not occur.

    Basically we have all the algorithms in place to deal with thrashing.
    I'm sure that there are a few places where we can optimize things...
    for example, we can certainly tune the swapout algorithm itself.

						-Matt

:regards,
:
:Rik
:--
:Virtual memory is like a game you can't win;
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux.eu.org/Linux-MM/

next prev parent reply	other threads:[~2001-05-07 22:50 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2001-05-07 21:16 Rik van Riel
2001-05-07 22:50 ` Matt Dillon [this message]
2001-05-07 23:35   ` Rik van Riel
2001-05-08  0:56     ` Matt Dillon
2001-05-12 14:23       ` Rik van Riel
2001-05-12 17:21         ` Matt Dillon
2001-05-12 21:17           ` Rik van Riel
2001-05-12 23:58         ` Matt Dillon
2001-05-13 17:22           ` Rik van Riel
2001-05-15  6:38             ` Terry Lambert
2001-05-15 13:39               ` Cy Schubert - ITSD Open Systems Group
2001-05-15 15:31               ` Rik van Riel
2001-05-15 17:24               ` Matt Dillon
2001-05-15 23:55                 ` Roger Larsson
2001-05-16  0:16                   ` Matt Dillon
2001-05-16  4:22                     ` Kernel Debugger Amarnath Jolad
2001-05-16  7:58                       ` Kris Kennaway
2001-05-16 11:42                       ` Martin Frey
2001-05-16 12:04                         ` R.Oehler
2001-05-16  8:23                 ` on load control / process swapping Terry Lambert
2001-05-16 17:26                   ` Matt Dillon
2001-05-08 20:52   ` Kirk McKusick
2001-05-09  0:18     ` Matt Dillon
2001-05-09  2:07       ` Peter Jeremy
2001-05-09 19:41         ` Matt Dillon
2001-05-12 14:28       ` Rik van Riel
2001-05-08 12:25 ` Scott F. Kaplan
2001-05-16 15:17 Charles Randall
2001-05-16 17:14 Matt Dillon
2001-05-16 17:41 ` Rik van Riel
2001-05-16 17:54   ` Matt Dillon
2001-05-18  5:58     ` Terry Lambert
2001-05-18  6:20       ` Matt Dillon
2001-05-18 10:00         ` Andrew Reilly
2001-05-18 13:49         ` Jonathan Morton
2001-05-19  2:18           ` Rik van Riel
2001-05-19  2:56             ` Jonathan Morton
2001-05-16 17:57   ` Alfred Perlstein
2001-05-16 18:01     ` Matt Dillon
2001-05-16 18:10       ` Alfred Perlstein
     [not found] <OF5A705983.9566DA96-ON86256A50.00630512@hou.us.ray.com>
2001-05-18 20:13 ` Jonathan Morton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=200105072250.f47MoKe68863@earth.backplane.com \
    --to=dillon@earth.backplane.com \
    --cc=arch@freebsd.org \
    --cc=linux-mm@kvack.org \
    --cc=riel@conectiva.com.br \
    --cc=sfkaplan@cs.amherst.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox