Re: on load control / process swapping

From: Rik van Riel <riel@conectiva.com.br>
To: Matt Dillon <dillon@earth.backplane.com>
Cc: arch@freebsd.org, linux-mm@kvack.org, sfkaplan@cs.amherst.edu
Subject: Re: on load control / process swapping
Date: Mon, 7 May 2001 20:35:25 -0300 (BRST)
Message-ID: <Pine.LNX.4.33.0105071956180.18102-100000@duckman.distro.conectiva>
In-Reply-To: <200105072250.f47MoKe68863@earth.backplane.com>

On Mon, 7 May 2001, Matt Dillon wrote:

> :1) allow the resident processes to stay resident long
> :   enough to make progress
>
>     This is accomplished as a side effect of the way the page queues
>     are handled.  A page placed in the active queue is not allowed
>     to be moved out of that queue for a minimum period of time based
>     on page aging.  See line 500 or so of vm_pageout.c (in -stable).
>
>     Thus when a process wakes up and pages a bunch of pages in, those
>     pages are guaranteed to stay in-core for a period of time no matter
>     what level of memory stress is occurring.

I don't see anything limiting the speed at which the active list
is scanned over and over again. OTOH, you are right that a failure
to deactivate enough pages will trigger the swapout code...

This sure is a subtle interaction ;)
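
To make the interaction concrete, here is a minimal sketch of aging-based residency. It is an illustration only, not the vm_pageout.c code: the Page class, the constants and both functions are hypothetical names chosen for this example. It assumes a page leaves the active queue only once its age counter has been decremented to zero by successive scans.

```python
from dataclasses import dataclass

# Hypothetical tuning values, chosen only for this illustration.
ACT_INIT = 5      # age given to a freshly activated page
ACT_DECLINE = 1   # age lost per scan when the page was not referenced
ACT_ADVANCE = 3   # age gained per scan when the page was referenced
ACT_MAX = 64      # upper bound on the age counter


@dataclass
class Page:
    act_count: int = ACT_INIT
    referenced: bool = False


def scan_active(active, inactive):
    """One pass over the active queue; fully aged pages move to inactive."""
    still_active = []
    for page in active:
        if page.referenced:
            # Referenced since the last scan: age up and clear the bit.
            page.act_count = min(page.act_count + ACT_ADVANCE, ACT_MAX)
            page.referenced = False
            still_active.append(page)
        else:
            page.act_count -= ACT_DECLINE
            if page.act_count <= 0:
                inactive.append(page)
            else:
                still_active.append(page)
    active[:] = still_active


def scans_until_deactivated(page):
    """Count how many scans it takes before the page is deactivated."""
    active, inactive = [page], []
    scans = 0
    while active:
        scan_active(active, inactive)
        scans += 1
    return scans
```

In this model a fresh, unreferenced page survives five scans. That is a number of scans, not an amount of time, so the guarantee is only as long as the interval between scans, which is exactly the point about nothing limiting the scan speed.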

> :2) make sure the resident processes aren't thrashing,
> :   that is, don't let new processes back in memory if
> :   none of the currently resident processes is "ready"
> :   to be suspended
>
>     When a process is swapped out, the process is removed from the run
>     queue and the P_INMEM flag is cleared.  The process is only woken up
>     when faultin() is called (vm_glue.c line 312).  faultin() is only
>     called from the scheduler() (line 340 of vm_glue.c) and the scheduler
>     only runs when the VM system indicates a minimum number of free pages
>     are available (vm_page_count_min()), which you can adjust with
>     the vm.v_free_min sysctl (usually represents 1-9 megabytes, depending
>     on how much memory the system has).

But ... is this a good enough indication that the processes
currently resident have enough memory available to make any
progress ?

Especially if all the currently resident processes are waiting
in page faults, won't that make it easier for the system to find
pages to swap out, etc... ?

One thing I _am_ wondering though: the pageout and the pagein
thresholds are different. Can't this lead to problems where we
always hit both the pageout threshold -and- the pagein threshold
and the system thrashes swapping processes in and out ?
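
A toy model of that worry, under loudly stated assumptions: the function and all of its parameters are hypothetical, there is a single swappable process, swap-in is gated only on a free-page minimum, and swap-out triggers only when free pages fall below a separate, unrelated threshold. It is not a description of what FreeBSD actually does.

```python
def simulate(free_pages, swapin_min, swapout_below, working_set, steps):
    """Return the sequence of 'in'/'out' events for one process.

    swapin_min:     free pages required before the process is swapped in
    swapout_below:  free-page level under which the process is swapped out
    working_set:    pages the process occupies while resident
    """
    resident = False
    events = []
    for _ in range(steps):
        if not resident and free_pages >= swapin_min:
            # The swap-in gate looks at free pages only, not at whether
            # the incoming working set actually fits.
            free_pages -= working_set
            resident = True
            events.append("in")
        elif resident and free_pages < swapout_below:
            free_pages += working_set
            resident = False
            events.append("out")
    return events


# Working set larger than the gap between the thresholds: ping-pong.
print(simulate(1000, 500, 300, 800, 6))
# Working set that fits: one swap-in, then stable.
print(simulate(1000, 500, 300, 400, 6))
```

In this model the process bounces whenever bringing it in pushes free memory below the swapout level, so the two thresholds would have to be related to the size of what is being swapped in for the system to settle.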

> :3) have a mechanism to detect thrashing in a VM
> :   subsystem which isn't rate-limited  (hard?)
>
>     In FreeBSD, rate-limiting is a function of a lightly loaded system.
>     We rate-limit page laundering (pageouts).  However, if the rate-limited
>     laundering is not sufficient to reach our free + cache page targets,
>     we take another laundering loop and this time do not limit it at all.
>
>     Thus under heavy memory pressure, no real rate limiting occurs.  The
>     system will happily pagein and pageout megabytes/sec.  The reason we
>     do this is because David Greenman and John Dyson found a long time
>     ago that attempting to rate limit paging does not solve the
>     thrashing problem; it actually makes it worse... So they solved the
>     problem another way (see my answers for #1 and #2).  It isn't the
>     paging operations themselves that cause thrashing.

Agreed on all points ... I'm just wondering how well 1) and 2)
still work after all the changes that were made to the VM in
the last few years.  They sure are subtle ...
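
The two-pass laundering described under 3) can be sketched as follows. This is a simplified model with hypothetical names (launder, max_per_pass); it only assumes what the quoted text says: a rate-limited first pass, then an unlimited second pass if the free + cache target has not been reached.

```python
def launder(dirty_pages, shortage, max_per_pass):
    """Return (first_pass, second_pass) page counts written out.

    dirty_pages:  dirty pages available for laundering
    shortage:     pages still needed to reach the free + cache target
    max_per_pass: rate limit applied to the first pass only
    """
    # Pass 1: rate-limited, enough on a lightly loaded system.
    first = min(dirty_pages, shortage, max_per_pass)
    dirty_pages -= first
    shortage -= first
    if shortage <= 0:
        return first, 0

    # Pass 2: target not met, so launder without any limit.
    second = min(dirty_pages, shortage)
    return first, second
```

With a small shortage the limit is never reached and the second pass does nothing; with a large one the second pass writes whatever it takes, which is why no real rate limiting is left under heavy pressure.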

> :and, for extra brownie points:
> :4) fairness, small processes can be paged in and out
> :   faster, so we can suspend&resume them faster; this
> :   has the side effect of leaving the proverbial root
> :   shell more usable
>
>     Small processes can contribute to thrashing as easily as large
>     processes can under extreme memory pressure... for example,
>     take an overloaded shell machine.  *ALL* processes are 'small'
>     processes in that case, or most of them are, and in great numbers
>     they can be the cause.  So no test that specifically checks the
>     size of the process can be used to give it any sort of priority.

There's a test related to 2) though ... A small process needs
to be in memory less time than a big process in order to make
progress, so it can be swapped out earlier.

It can also be swapped back in earlier, giving small processes
shorter "time slices" for swapping than what large processes
have.  I'm not quite sure how much this would matter, though...
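
A rough sketch of what size-dependent swap "time slices" could look like. Everything here is an assumption made for illustration: the Proc class, the per-page cost, the progress factor and the linear formula are invented, and nothing in this thread says either kernel does this.

```python
from dataclasses import dataclass

# Hypothetical constants, chosen only for this illustration.
PAGE_IN_MS = 10        # assumed cost of paging one page back in
PROGRESS_FACTOR = 4    # stay resident this many times the page-in cost


@dataclass
class Proc:
    name: str
    resident_pages: int
    swapped_in_at_ms: int


def min_resident_ms(resident_pages):
    """Minimum residency before a process may be swapped out again."""
    return resident_pages * PAGE_IN_MS * PROGRESS_FACTOR


def eligible_for_swapout(procs, now_ms):
    """Names of processes that have used up their residency slice."""
    return [
        p.name
        for p in procs
        if now_ms - p.swapped_in_at_ms >= min_resident_ms(p.resident_pages)
    ]


procs = [Proc("shell", 50, 0), Proc("big", 5000, 0)]
print(eligible_for_swapout(procs, 3000))
```

Note that this ties only the length of the slice to process size; it gives small processes no priority, so it does not contradict the overloaded shell machine example.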

> :5) make sure already resident processes cannot create
> :   a situation that'll keep the swapped out tasks out
> :   of memory forever ... but don't kill performance either,
> :   since bad performance means we cannot get out of the
> :   bad situation we're in
>
>     When the system starts swapping processes out, it continues to swap
>     them out until memory pressure goes down.  With memory pressure down
>     processes are swapped back in again one at a time, typically in FIFO
>     order.  So this situation will generally not occur.
>
>     Basically we have all the algorithms in place to deal with thrashing.
>     I'm sure that there are a few places where we can optimize things...
>     for example, we can certainly tune the swapout algorithm itself.

Interesting, FreeBSD indeed _does_ seem to have all of the things in
place (though the interactions between the various parts seem to be
carefully hidden ;)).

They indeed should work for lots of scenarios, but things like the
subtlety of some of the code and the fact that the swapin and
swapout thresholds are fairly unrelated look a bit worrying...
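
For point 5), the behaviour described in the quoted answer (swap in one process at a time, typically in FIFO order, and only once memory pressure is down) can be modelled like this. SwapQueue and its methods are hypothetical names, and "pressure is down" is reduced here to a single free-page comparison, which is an assumption.

```python
from collections import deque


class SwapQueue:
    """FIFO of swapped-out processes, drained one entry per call."""

    def __init__(self):
        self._swapped = deque()

    def swap_out(self, name):
        # Most recently swapped-out process goes to the back of the queue.
        self._swapped.append(name)

    def maybe_swap_in(self, free_pages, free_min):
        """Return the next process to bring back in, or None."""
        if free_pages >= free_min and self._swapped:
            # Oldest swapped-out process first, so none waits forever
            # as long as the pressure does drop at some point.
            return self._swapped.popleft()
        return None
```

The FIFO order is what prevents starvation in this model; whether pressure ever drops far enough for the gate to open is the part that depends on the thresholds.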

regards,

Rik
--
Linux MM bugzilla: http://linux-mm.org/bugzilla.shtml

Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...

		http://www.surriel.com/
http://www.conectiva.com/	http://distro.conectiva.com/

