From mboxrd@z Thu Jan 1 00:00:00 1970 From: Nick Piggin Subject: Re: [PATCH 11 of 11] not-wait-memdie Date: Tue, 8 Jan 2008 14:25:31 +1100 References: <504e981185254a12282d.1199326157@v2.random> In-Reply-To: MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200801081425.31515.nickpiggin@yahoo.com.au> Sender: owner-linux-mm@kvack.org Return-Path: To: David Rientjes Cc: Christoph Lameter , Andrea Arcangeli , linux-mm@kvack.org, Andrew Morton List-ID: On Tuesday 08 January 2008 12:57, David Rientjes wrote: > On Mon, 7 Jan 2008, Christoph Lameter wrote: > > > + if (unlikely(test_tsk_thread_flag(p, TIF_MEMDIE))) { > > > + /* > > > + * Hopefully we already waited long enough, > > > + * or exit_mm already run, but we must try to kill > > > + * another task to avoid deadlocking. > > > + */ > > > + continue; > > > + } > > > > If all tasks are marked TIF_MEMDIE then we just scan through them return > > NULL and > > That's the problem that I've been mentioning: giving several tasks access > to memory reserves just isn't right. We already do that today in the case of regular page reclaim. The problem is the global reserve. Once you have a kernel that doesn't need this handwavy global reserve for forward progress, a lot of little problems go away. > It should be given to a single > OOM-killed task that will alleviate the OOM condition for the task that > called out_of_memory(). It should be, but that task you OOM may be blocking on another one that is waiting for memory, for example. In practice, I think a task will not need a great deal of memory in order to finish what it is doing and exit; but it will be more likely to be in some oom deadlock. So neither solution is perfect, but I think this patch will solve more cases than it introduces. > For an entire system it would still be possible > for several tasks to be TIF_MEMDIE (in the case of cpuset, memory > controller, or mempolicy OOM killing) but never more than one task that > shares a common zone. > > > > /* Found nothing?!?! Either we hang forever, or we panic. */ > > > - if (!p) { > > > + if (unlikely(!p)) { > > > read_unlock(&tasklist_lock); > > > panic("Out of memory and no killable processes...\n"); > > > > panic. > > > > Should we not wait awhile before panicing? The processes may need some > > time to terminate. > > That's only possible with my proposal of adding > > unsigned long oom_kill_jiffies; > > to struct task_struct. We can't get away with a system-wide jiffies > variable, nor can we get away with per-cgroup, per-cpuset, or > per-mempolicy variable. The only way to clear such a variable is in the > exit path (by checking test_thread_flag(tsk, TIF_MEMDIE) in do_exit()) and > fails miserably if there are simultaneous but zone-disjoint OOMs > occurring. Why not just have a global frequency limit on OOM events. Then the panic has this delay factored in... -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org