Date: Tue, 3 Aug 2010 09:36:10 +0900
From: KAMEZAWA Hiroyuki
Subject: Re: [patch -mm 1/2] oom: badness heuristic rewrite
Message-Id: <20100803093610.f4d30ca7.kamezawa.hiroyu@jp.fujitsu.com>
References: <20100730091125.4AC3.A69D9226@jp.fujitsu.com>
 <20100729183809.ca4ed8be.akpm@linux-foundation.org>
 <20100730195338.4AF6.A69D9226@jp.fujitsu.com>
 <20100802134312.c0f48615.akpm@linux-foundation.org>
 <20100803090058.48c0a0c9.kamezawa.hiroyu@jp.fujitsu.com>
To: David Rientjes
Cc: Andrew Morton, KOSAKI Motohiro, Nick Piggin, Oleg Nesterov,
 Balbir Singh, linux-mm@kvack.org

On Mon, 2 Aug 2010 17:27:13 -0700 (PDT)
David Rientjes wrote:

> On Tue, 3 Aug 2010, KAMEZAWA Hiroyuki wrote:
>
> > One reason I pointed out is that this new
> > parameter is hard to use for admins and
> > library writers.
> >   The old oom_adj was defined as a parameter that works as
> >       (memory usage of app)/oom_adj.
>
> Where are you getting this definition from?
>
> Disregarding all the other small adjustments in the old heuristic, a
> reduced version of the formula was mm->total_vm << oom_adj.  It's a
> shift, not a divide.  That has no sensible meaning.
>
Yes, that was quite useless.

> >   The new oom_score_adj was defined as
> >       (memory usage of app * oom_score_adj)/ system_memory
> >
>
> No, it's (rss + swap + oom_score_adj) / bound memory.  It's an addition,
> not a multiplication, and it's a proportion of memory the application is
> bound to, not the entire system (it could be constrained by cpuset,
> mempolicy, or memcg).
>
Sorry.

> > Then, an application's oom_score on one host is quite different from its
> > score on another host. This operation is something very new rather than
> > a simple interface update.
> > This opinion was rejected.
> >
>
> It wasn't rejected, I responded to your comment and you never wrote back.
> The idea
>
I just got tired of writing the same thing many times. And I don't have
strong opinions. I _know_ your patch fixes the X-server problem. That was
enough for me.

> > Anyway, I believe a value other than OOM_DISABLE is useless,
>
> You're right in that OOM_DISABLE fulfills many typical use cases to simply
> protect a task by making it immune to the oom killer.  But there are other
> use cases for the oom killer that you're perhaps not using where a
> sensible userspace tunable does make a difference: the goal of the
> heuristic is always to kill the task consuming the most amount of memory
> to avoid killing tons of applications for subsequent page allocations.  We
> do run important tasks that consume lots of memory, though, and the kernel
> can't possibly know about that importance.
> So although you may never use
> a positive oom_score_adj, although others will, you probably can find a
> use case for subtracting a memory quantity from a known memory hogging
> task that you consider to be vital in an effort to disregard that quantity
> from the score.  I'm sure you'll agree it's a much more powerful (and
> fine-grained) interface than oom_adj.
>
Yes, I agree, if we can assume the admins are very clever.

> > I have no concerns. I'll use memcg if I want to control this kind of thing.
> >
>
> That would work if you want to setup individual memcgs for every
> application on your system, know what sane limits are for each one, and
> want to incur the significant memory expense of enabling
> CONFIG_CGROUP_MEM_RES_CTLR for its metadata.
>
The usual distro already enables it.
Simply put all applications into one group, disable oom, and set an
oom_notifier.
Then,
 - a "pop-up window" with a task list will ask the user
   "which one do you want to kill?"
 - send a packet to ask an administration server system
   "which one is killable?" or "increase memory limit?" or "memory hot-add?"

A possible sequence would be
 - send SIGSTOP to all apps at OOM.
 - raise the limit to some extent, or move a killable one to a special group.
 - wake up a killable one with SIGCONT.
 - send SIGHUP to stop it safely.

"My application was killed by the system, without running its safe
emergency code!" is the fundamental seed of discontent.

Thanks,
-Kame
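
For reference, the two reduced formulas quoted in this thread can be compared
with a small user-space C sketch. This is not the kernel code. The function
names, the page-count parameters, the 0..1000 scale, and the reading of
oom_score_adj as a value in thousandths of the bound memory are all
assumptions made for illustration; the thread itself only states
mm->total_vm << oom_adj and (rss + swap + oom_score_adj) / bound memory.

```c
#include <assert.h>

/*
 * Reduced old heuristic as quoted in the thread: total_vm shifted by
 * oom_adj. A negative oom_adj is treated as a right shift here, which
 * is an assumption of this sketch.
 */
static unsigned long old_badness(unsigned long total_vm_pages, int oom_adj)
{
    if (oom_adj >= 0)
        return total_vm_pages << oom_adj;
    return total_vm_pages >> -oom_adj;
}

/*
 * Reduced new heuristic as quoted in the thread: the task's rss + swap
 * as a proportion of the memory it is bound to, with oom_score_adj
 * added (not multiplied). The 0..1000 scale and the clamping are
 * assumptions of this sketch.
 */
static long new_badness(unsigned long rss_pages, unsigned long swap_pages,
                        unsigned long bound_pages, int oom_score_adj)
{
    long points;

    assert(bound_pages > 0);    /* a task is always bound to some memory */

    points = (long)((rss_pages + swap_pages) * 1000 / bound_pages);
    points += oom_score_adj;    /* addition, in thousandths of bound memory */

    if (points < 0)
        points = 0;
    if (points > 1000)
        points = 1000;
    return points;
}
```

With rss + swap = 250000 pages, the same task scores 250 when it is bound to
1000000 pages and 62 when it is bound to 4000000 pages, which is the
host-dependence raised earlier in the thread. Under the same assumptions, an
oom_score_adj of -200 lowers the first score to 50.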