From mboxrd@z Thu Jan 1 00:00:00 1970 Date: Sun, 29 Oct 2000 20:41:46 +0000 (GMT) From: James Sutherland Subject: Re: Discussion on my OOM killer API In-Reply-To: <20001029203046.A23822@nightmaster.csn.tu-chemnitz.de> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-linux-mm@kvack.org Return-Path: To: Ingo Oeser Cc: Rik van Riel , linux-mm@kvack.org List-ID: On Sun, 29 Oct 2000, Ingo Oeser wrote: > On Fri, Oct 27, 2000 at 06:36:13PM +0100, James Sutherland wrote: > > > If I do the full blown variant of my patch: > > EBADIDEA. The kernel's OOM killer is a last ditch "something's going to > > die - who's first?" - adding extra bloat like this is BAD. > > Ok. So it's easier for me ;-) Also a plus point, I think :) > > Policy should be decided user-side, and should prevent the kernel-side > > killer EVER triggering. > > So your user space OOM handler would like to be notified on > memory *pressure* (not only about OOM)? You would like to shrink > image caches and the like with it? Sounds sane. That's a bit more elaborate than my plan, but I'd like to get that at some point (2.5?) - "Hey, X, we're getting short on RAM here. How about ditching a few cached bitmaps?" etc. My plan for now was simpler: my handler gets notified "only x Mb left!" and works down a priority list of things to try at each threshold: maybe kill netscapes if we're down to 2 Mb, send a SIGDANGER to everyone at 1 Mb, reboot if we drop below 0.5 Mb. > But then we need information on _how_ much memory we need. I > could pass allocation "priority" to user space, but I doubt that > will be descriptive enough. Memory pressure is very difficult to measure, but if the configuration is flexible enough the daemon should be able to handle most things. > > I was planning to implement a user-side OOM killer myself - perhaps we > > could split the work, you do kernel-side, I'll do the userspace bits? > > If you could clarify, what events you actually like to get, I > could implement this a loadable OOM handler. The main event I'm interested in is "free memory below " (and "free memory above "; users may want things restarted once the problem has gone). Having the same facility for short-term load average might be nice, too, to catch fork-bombs etc. before they hurt the system too badly. The mechanism I have in mind is something like this: 1. System boots up, init loads /sbin/resourced 2. resourced works out the nearest threshold of interest, and writes this to /dev/resources, then performs a blocking read (i.e. sleeps until that threshold is crossed). (Runs as root, realtime maximum priority.) 3. As a threshold is crossed, the read will return appropriate information: which threshold was crossed, for example. This wakes resourced, which takes appropriate action. For a simple configuration, I'd probably just want to warn logged-in users when we drop below, say, 5 Mb, then start killing things at 1 Mb. > But still my patch is much more flexible, since it even allows to > panic & reboot. I would prefer that on embedded systems, which > boot _really_ fast, over blowing the system by adding mutally > watching to my important processes. I'd rather reboot cleanly from userspace if possible, avoiding the panic() route. OK, in an embedded system you may not have big RAID arrays to fsck etc., but I think most programmers would rather not have the system panic unnecessarily? James. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux.eu.org/Linux-MM/