From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <44739E2D.60406@yahoo.com.au>
Date: Wed, 24 May 2006 09:43:41 +1000
From: Nick Piggin
MIME-Version: 1.0
Subject: Re: [PATCH (try #3)] mm: avoid unnecessary OOM kills
References: <200605230032.k4N0WCIU023760@calaveras.llnl.gov> <4472A006.2090006@yahoo.com.au> <7.0.0.16.2.20060523094646.02429fd8@llnl.gov>
In-Reply-To: <7.0.0.16.2.20060523094646.02429fd8@llnl.gov>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
Return-Path:
To: Dave Peterson
Cc: linux-kernel@vger.kernel.org, akpm@osdl.org, pj@sgi.com, ak@suse.de, linux-mm@kvack.org, garlick@llnl.gov, mgrondona@llnl.gov
List-ID:

Dave Peterson wrote:

> At 10:39 PM 5/22/2006, Nick Piggin wrote:
>
>>Does this fix observed problems on real (or fake) workloads? Can we have
>>some more information about that?

[snip]

OK, thanks.

>>I still don't quite understand why all this mechanism is needed. Suppose
>>that we single-thread the oom kill path (which isn't unreasonable, unless
>>you need really good OOM throughput :P), isn't it enough to find that any
>>process has TIF_MEMDIE set in order to know that an OOM kill is in progress?
>>
>>down(&oom_sem);
>>for each process {
>>        if TIF_MEMDIE
>>                goto oom_in_progress;
>>        else
>>                calculate badness;
>>}
>>up(&oom_sem);
>
>
> That would be another way to do things.  It's a tradeoff between either
>
>    option A: Each task that enters the OOM code path must loop over all
>              tasks to determine whether an OOM kill is in progress.
>
> or...
>
>    option B: We must declare an oom_kill_in_progress variable and add
>              the following snippet of code to mmput():
>
>         put_swap_token(mm);
> +       if (unlikely(test_bit(MM_FLAG_OOM_NOTIFY, &mm->flags)))
> +               oom_kill_finish();  /* terminate pending OOM kill */
>         mmdrop(mm);
>
> I think either option is reasonable (although I have a slight preference
> for B since it eliminates substantial looping through the tasklist).

Don't you have to loop through the tasklist anyway? To find a task to kill?

Either way, at the point of OOM, usually they should have gone through the
LRU lists several times, so a little bit more CPU time shouldn't hurt.

>
>>Is all this really required? Shouldn't you just have in place the
>>mechanism to prevent concurrent OOM killings in the OOM code, and
>>so the page allocator doesn't have to bother with it at all (ie.
>>it can just call into the OOM killer, which may or may not actually
>>kill anything).
>
>
> I agree it's desirable to keep the OOM killing logic as encapsulated
> as possible.  However unless you are holding the oom kill semaphore
> when you make your final attempt to allocate memory it's a bit racy.
> Holding the OOM kill semaphore guarantees that our final allocation
> failure before invoking the OOM killer occurred _after_ any previous
> OOM kill victim freed its memory.  Thus we know we are not shooting
> another process prematurely (i.e. before the memory-freeing effects
> of our previous OOM kill have been felt).

But there is so much fudge in it that I don't think it matters: pages
could be freed from other sources, some reclaim might happen, the point
at which OOM is declared is pretty arbitrary anyway, etc.

-- 
SUSE Labs, Novell Inc.
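
A minimal userspace sketch of the "option A" scheme Nick outlines above: one
lock around the OOM path, plus a per-task "dying" flag standing in for
TIF_MEMDIE, so that a second caller who finds a pending victim backs off
instead of choosing another one. Every identifier here (struct task, oom_lock,
oom_select_victim(), oom_victim_exited(), the badness field) is invented for
illustration; this is not the kernel's actual OOM-killer code.

/* Build with: cc -pthread oom_sketch.c */
#include <pthread.h>
#include <stddef.h>
#include <stdio.h>

struct task {
        const char *comm;       /* task name */
        long badness;           /* heuristic "kill me first" score */
        int memdie;             /* stand-in for TIF_MEMDIE */
};

static struct task tasks[] = {
        { "init",    1, 0 },
        { "httpd",  50, 0 },
        { "hog",   900, 0 },
};
#define NTASKS (sizeof(tasks) / sizeof(tasks[0]))

static pthread_mutex_t oom_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Scan the task list under oom_lock.  If a previous victim has not yet
 * exited (memdie still set), return NULL rather than shooting another
 * task; otherwise pick the highest-badness task and mark it dying.
 */
static struct task *oom_select_victim(void)
{
        struct task *victim = NULL;
        size_t i;

        pthread_mutex_lock(&oom_lock);
        for (i = 0; i < NTASKS; i++) {
                if (tasks[i].memdie) {          /* kill already in progress */
                        victim = NULL;
                        goto out;
                }
                if (!victim || tasks[i].badness > victim->badness)
                        victim = &tasks[i];
        }
        if (victim)
                victim->memdie = 1;             /* mark it, as the OOM killer would */
out:
        pthread_mutex_unlock(&oom_lock);
        return victim;
}

/* Called once the victim has exited and its memory is actually freed. */
static void oom_victim_exited(struct task *t)
{
        pthread_mutex_lock(&oom_lock);
        t->memdie = 0;
        pthread_mutex_unlock(&oom_lock);
}

int main(void)
{
        struct task *v1 = oom_select_victim();
        struct task *v2 = oom_select_victim();  /* NULL: first kill still pending */

        printf("first victim:  %s\n", v1 ? v1->comm : "(none)");
        printf("second victim: %s\n", v2 ? v2->comm : "(none)");

        /* Only after the victim's memory is really gone may another kill proceed. */
        oom_victim_exited(v1);
        return 0;
}

The sketch makes the tradeoff in the thread concrete: option A pays for one
extra pass over the tasklist on each OOM attempt (which, as Nick notes, is
cheap next to the reclaim work already done by that point), while Dave's
option B instead clears the in-progress state from a hook in mmput() when the
victim's mm is finally dropped.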