linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: James A. Sutherland <jas88@cam.ac.uk>
To: Szabolcs Szakacsits <szaka@f-secure.com>
Cc: linux-mm@kvack.org
Subject: Re: suspend processes at load (was Re: a simple OOM ...)
Date: Sat, 21 Apr 2001 07:08:29 +0100	[thread overview]
Message-ID: <6u72et8hqnb32nd6da881frgpulnve8rj7@4ax.com> (raw)
In-Reply-To: <Pine.LNX.4.30.0104201253500.20939-100000@fs131-224.f-secure.com>

On Fri, 20 Apr 2001 14:25:36 +0200 (MET DST), you wrote:
>On Thu, 19 Apr 2001, James A. Sutherland wrote:
>
>> Rik and I are both proposing that, AFAICS; however it's implemented
>
>Is it implemented? So why wasting words? Why don't you send the patch
>for tests?

It isn't. You've mangled my sentence, changing the meaning...

>> since I think it could be done more neatly) you just suspend the
>> process for a couple of seconds,
>
>Processes are already suspended in __alloc_pages() for potentially
>infinitely. This could explain why you see no progress and perhaps also
>other people's problems who reported lockups on lkml. I run with a patch
>that prevents this infinite looping in __alloc_pages().

Yep, that's the whole problem. One process starts running, page
faults, so another starts running and faults - if you have enough
faults before you get back to the first process, it will fault again
straight away because you've swapped the page it was waiting for back
out!

>So suspend at page level didn't help, now comes process level. What
>next? Because it will not help either.

It will... "suspend at page level" is part of the problem: you need to
make sure the process gets a chance to USE the memory it just faulted
in.

>What would help from kernel level?
>
>o reserved root vm, 

Not much; OK, it would allow you to log in and kill the runaway
processes, but that's it. An Alt+SysRq key could do the same...

>class/fair share scheduling (I run with the former
>  and helps a lot to take control back [well to be honest, your
>  statements about reboots are completely false even without too strict
>  resource limits])

I was speaking from personal experience there...

>o non-overcommit [per process granularity and/or virtual swap spaces
>  would be nice as well]

Non-overcommit would just make matters worse: you would get the same
results but with a big chunk of swap space wasted "just in case". How
exactly would that help?

>o better system monitoring: more info, more efficiently, smaller
>  latencies [I mean 1 sec is ok but not the occasional 10+ sec
>  accumulated stats that just hide a problem. This seems inrelevant but
>  would help users and kernel developers to understand better a particular
>  workload and tune or fix things (possibly not with the currently
>  popular hard coded values).

You don't need system monitoring to detect thrashing: this would be
like fitting a warning light to your car to indicate "You've hit
something!": other subtle hints like the loud noise, the impact and
the change in car shape should convey this information already.

>As Stephen mentioned there are many [other] ways to improve things and I
>think process suspension is just the wrong one.

It's the best approach in the pathological cases where we NEED to do
something drastic or we lose the box.

>> Indeed. It would certainly help with the usual test-case for such
>> things ("make -j 50" or similar): you'll end up with 40 gcc processes
>> being frozen at once, allowing the other 10 to complete first.
>
>Can I recommend a real life test-case? Constant/increasing rate hit
>to a dynamic web server.

Yep, OK; let's assume it's a prefork server like Apache 1.3, so you
have lots of independent processes, each serving one client. Right
now, each request will hit a thrashing process. On non-thrashing
systems (running in RAM) the request takes 1 seconds to process. If
you're very lucky, thrashing, the request will be handled within two
hours. By which time, any real-world browser has given up, and you
wasted a lot of resources feeding data to /dev/null.

Now we try with process suspension. Again, we'll have Apache's
MaxProcesses number of processes running accepting requests, but this
time all the active processes are being periodically suspended to
allow others to complete. Suppose we can support 10 simultaneous
processes, and MaxProcesses is 100; the worst case is then that a 1
second response time goes to 10, instead of every single request
timing out.

Summary: with process suspension, clients get handled slowly. Without
it, requests go to /dev/null and eat CPU on the way. I know which I
prefer!


James.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux.eu.org/Linux-MM/

  reply	other threads:[~2001-04-21  6:08 UTC|newest]

Thread overview: 85+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2001-04-12 16:58 [PATCH] a simple OOM killer to save me from Netscape Slats Grobnik
2001-04-12 18:25 ` Rik van Riel
2001-04-12 18:49   ` James A. Sutherland
2001-04-13  6:45   ` Eric W. Biederman
2001-04-13 16:20     ` Rik van Riel
2001-04-14  1:20       ` Stephen C. Tweedie
2001-04-16 21:06         ` James A. Sutherland
2001-04-16 21:40           ` Jonathan Morton
2001-04-16 22:12             ` Rik van Riel
2001-04-16 22:21             ` James A. Sutherland
2001-04-17 14:26               ` Jonathan Morton
2001-04-17 19:53                 ` Rik van Riel
2001-04-17 20:44                   ` James A. Sutherland
2001-04-17 20:59                     ` Jonathan Morton
2001-04-17 21:09                       ` James A. Sutherland
2001-04-14  7:00       ` Eric W. Biederman
2001-04-15  5:05         ` Rik van Riel
2001-04-15  5:20           ` Rik van Riel
2001-04-16 11:52         ` Szabolcs Szakacsits
2001-04-16 12:17       ` suspend processes at load (was Re: a simple OOM ...) Szabolcs Szakacsits
2001-04-17 19:48         ` Rik van Riel
2001-04-18 21:32           ` Szabolcs Szakacsits
2001-04-18 20:38             ` James A. Sutherland
2001-04-18 23:25               ` Szabolcs Szakacsits
2001-04-18 22:29                 ` Rik van Riel
2001-04-19 10:14                   ` Stephen C. Tweedie
2001-04-19 13:23                   ` Szabolcs Szakacsits
2001-04-19  2:11                 ` Rik van Riel
2001-04-19  7:08                   ` James A. Sutherland
2001-04-19 13:37                     ` Szabolcs Szakacsits
2001-04-19 12:26                       ` Christoph Rohland
2001-04-19 12:30                       ` James A. Sutherland
2001-04-19  9:15                 ` James A. Sutherland
2001-04-19 18:34             ` Dave McCracken
2001-04-19 18:47               ` James A. Sutherland
2001-04-19 18:53                 ` Dave McCracken
2001-04-19 19:10                   ` James A. Sutherland
2001-04-20 14:58                     ` Rik van Riel
2001-04-21  6:10                       ` James A. Sutherland
2001-04-19 19:13                   ` Rik van Riel
2001-04-19 19:47                     ` Gerrit Huizenga
2001-04-20 12:44                       ` Szabolcs Szakacsits
2001-04-19 20:06                     ` James A. Sutherland
2001-04-20 12:29                     ` Szabolcs Szakacsits
2001-04-20 11:50                       ` Jonathan Morton
2001-04-20 13:32                         ` Szabolcs Szakacsits
2001-04-20 14:30                           ` Rik van Riel
2001-04-22 10:21                       ` James A. Sutherland
2001-04-20 12:25                 ` Szabolcs Szakacsits
2001-04-21  6:08                   ` James A. Sutherland [this message]
2001-04-20 12:18               ` Szabolcs Szakacsits
2001-04-22 10:19                 ` James A. Sutherland
2001-04-17 10:58 ` limit for number of processes Uman
2001-04-19 14:03 suspend processes at load (was Re: a simple OOM ...) Jonathan Morton
2001-04-19 18:25 ` Dave McCracken
2001-04-19 18:32   ` James A. Sutherland
2001-04-19 20:23     ` Jonathan Morton
2001-04-20 12:14     ` Szabolcs Szakacsits
2001-04-20 12:02       ` Jonathan Morton
2001-04-20 14:48       ` Dave McCracken
2001-04-21  5:49       ` James A. Sutherland
2001-04-21 19:16         ` Joseph A. Knapka
2001-04-21 19:41           ` Jonathan Morton
2001-04-22 10:08             ` James A. Sutherland
2001-04-22 16:53               ` Jonathan Morton
2001-04-22 17:06                 ` James A. Sutherland
2001-04-22 18:18                   ` Jonathan Morton
2001-04-22 18:57                     ` Rik van Riel
2001-04-22 19:41                       ` James A. Sutherland
2001-04-22 20:33                         ` Jean Francois Martinez
2001-04-22 20:21                       ` Jonathan Morton
2001-04-22 20:36                         ` Jonathan Morton
2001-04-22 19:01                     ` James A. Sutherland
2001-04-22 19:11                       ` Rik van Riel
2001-04-22 20:36                         ` James A. Sutherland
2001-04-22 19:30                       ` Jonathan Morton
2001-04-22 20:35                         ` James A. Sutherland
2001-04-22 20:41                           ` Rik van Riel
2001-04-22 20:58                             ` James A. Sutherland
2001-04-22 21:26                               ` Rik van Riel
2001-04-22 22:26                                 ` Jonathan Morton
2001-04-23  5:55                                   ` James A. Sutherland
2001-04-23  5:59                                     ` Rik van Riel
2001-04-21 20:29           ` Rik van Riel
2001-04-22 10:08           ` James A. Sutherland

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=6u72et8hqnb32nd6da881frgpulnve8rj7@4ax.com \
    --to=jas88@cam.ac.uk \
    --cc=linux-mm@kvack.org \
    --cc=szaka@f-secure.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox