From: James A. Sutherland
Subject: Re: suspend processes at load (was Re: a simple OOM ...)
Date: Sat, 21 Apr 2001 07:08:29 +0100
Message-ID: <6u72et8hqnb32nd6da881frgpulnve8rj7@4ax.com>
To: Szabolcs Szakacsits
Cc: linux-mm@kvack.org

On Fri, 20 Apr 2001 14:25:36 +0200 (MET DST), you wrote:

>On Thu, 19 Apr 2001, James A. Sutherland wrote:
>
>> Rik and I are both proposing that, AFAICS; however it's implemented
>
>Is it implemented? So why wasting words? Why don't you send the patch
>for tests?

It isn't. You've mangled my sentence, changing the meaning...

>> since I think it could be done more neatly) you just suspend the
>> process for a couple of seconds,
>
>Processes are already suspended in __alloc_pages() for potentially
>infinitely. This could explain why you see no progress and perhaps also
>other people's problems who reported lockups on lkml. I run with a patch
>that prevents this infinite looping in __alloc_pages().

Yep, that's the whole problem. One process starts running and page
faults, so another starts running and faults; if enough faults happen
before you get back to the first process, it will fault again straight
away, because the page it was waiting for has already been swapped
back out!

>So suspend at page level didn't help, now comes process level. What
>next? Because it will not help either.

It will... "suspend at page level" is part of the problem: you need to
make sure the process gets a chance to USE the memory it just faulted
in.

>What would help from kernel level?
>
>o reserved root vm,

Not much; OK, it would allow you to log in and kill the runaway
processes, but that's it. An Alt+SysRq key could do the same...

>class/fair share scheduling (I run with the former
> and helps a lot to take control back [well to be honest, your
> statements about reboots are completely false even without too strict
> resource limits])

I was speaking from personal experience there...

>o non-overcommit [per process granularity and/or virtual swap spaces
> would be nice as well]

Non-overcommit would just make matters worse: you would get the same
results, but with a big chunk of swap space wasted "just in case". How
exactly would that help?

>o better system monitoring: more info, more efficiently, smaller
> latencies [I mean 1 sec is ok but not the occasional 10+ sec
> accumulated stats that just hide a problem. This seems inrelevant but
> would help users and kernel developers to understand better a particular
> workload and tune or fix things (possibly not with the currently
> popular hard coded values).

You don't need system monitoring to detect thrashing: that would be
like fitting a warning light to your car to indicate "You've hit
something!" - the loud noise, the impact and the change in the car's
shape should convey that information already.

>As Stephen mentioned there are many [other] ways to improve things and I
>think process suspension is just the wrong one.

It's the best approach in the pathological cases where we NEED to do
something drastic or we lose the box.
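Since it isn't implemented yet, here is a toy userspace mock-up of the
policy using SIGSTOP/SIGCONT, just to make the idea concrete. It only
illustrates "freeze most of them so a few can finish"; the worker
command and every number in it are invented, and the real thing would
of course live inside the VM rather than in a wrapper program:

/*
 * Toy illustration only: rotate NRUNNABLE of NWORKERS memory hogs,
 * keeping the rest frozen so the runnable ones keep their pages
 * resident long enough to make progress.
 */
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define NWORKERS   8   /* total memory-hungry tasks                  */
#define NRUNNABLE  2   /* how many we let run at any one time        */
#define SLICE_SECS 2   /* "suspend for a couple of seconds" per turn */
#define NROUNDS    10

int main(void)
{
    pid_t pid[NWORKERS];
    int i, round;

    for (i = 0; i < NWORKERS; i++) {
        pid[i] = fork();
        if (pid[i] < 0) {
            perror("fork");
            return 1;
        }
        if (pid[i] == 0) {
            /* Stand-in for a real memory hog (gcc, httpd, ...). */
            execlp("sh", "sh", "-c", "sleep 30", (char *)NULL);
            _exit(1);
        }
        kill(pid[i], SIGSTOP);          /* everyone starts frozen */
    }

    /* Rotate: wake NRUNNABLE workers, park the rest, repeat. */
    for (round = 0; round < NROUNDS; round++) {
        for (i = 0; i < NWORKERS; i++)
            kill(pid[i], SIGSTOP);
        for (i = 0; i < NRUNNABLE; i++)
            kill(pid[(round * NRUNNABLE + i) % NWORKERS], SIGCONT);
        sleep(SLICE_SECS);
    }

    /* Thaw everyone and reap. */
    for (i = 0; i < NWORKERS; i++) {
        kill(pid[i], SIGCONT);
        waitpid(pid[i], NULL, 0);
    }
    return 0;
}

The kernel version would presumably pick its candidates from whatever
is looping in __alloc_pages() rather than from a fixed list, but the
effect on the workload is the same idea.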
>> Indeed. It would certainly help with the usual test-case for such
>> things ("make -j 50" or similar): you'll end up with 40 gcc processes
>> being frozen at once, allowing the other 10 to complete first.
>
>Can I recommend a real life test-case? Constant/increasing rate hit
>to a dynamic web server.

Yep, OK; let's assume it's a prefork server like Apache 1.3, so you
have lots of independent processes, each serving one client. Right
now, each request will hit a thrashing process. On a non-thrashing
system (running in RAM) a request takes 1 second to process. With the
box thrashing, if you're very lucky the request will be handled within
two hours - by which time any real-world browser has given up, and
you've wasted a lot of resources feeding data to /dev/null.

Now try it with process suspension. Again, we have Apache's
MaxProcesses worth of processes accepting requests, but this time the
active processes are periodically suspended to allow others to
complete. Suppose we can support 10 simultaneous processes and
MaxProcesses is 100: the 100 processes share 10 run slots, so in the
worst case the 1 second response time goes to about 10 seconds,
instead of every single request timing out.

Summary: with process suspension, clients get handled slowly. Without
it, requests go to /dev/null and eat CPU on the way. I know which I
prefer!


James.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux.eu.org/Linux-MM/