From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail172.messagelabs.com (mail172.messagelabs.com [216.82.254.3]) by kanga.kvack.org (Postfix) with SMTP id 25F476B007B for ; Mon, 8 Feb 2010 20:38:12 -0500 (EST) Received: from m3.gw.fujitsu.co.jp ([10.0.50.73]) by fgwmail5.fujitsu.co.jp (Fujitsu Gateway) with ESMTP id o191c9qM003052 for (envelope-from kamezawa.hiroyu@jp.fujitsu.com); Tue, 9 Feb 2010 10:38:09 +0900 Received: from smail (m3 [127.0.0.1]) by outgoing.m3.gw.fujitsu.co.jp (Postfix) with ESMTP id 3B21845DE51 for ; Tue, 9 Feb 2010 10:38:09 +0900 (JST) Received: from s3.gw.fujitsu.co.jp (s3.gw.fujitsu.co.jp [10.0.50.93]) by m3.gw.fujitsu.co.jp (Postfix) with ESMTP id 01E8C45DE50 for ; Tue, 9 Feb 2010 10:38:09 +0900 (JST) Received: from s3.gw.fujitsu.co.jp (localhost.localdomain [127.0.0.1]) by s3.gw.fujitsu.co.jp (Postfix) with ESMTP id C20631DB803C for ; Tue, 9 Feb 2010 10:38:08 +0900 (JST) Received: from m108.s.css.fujitsu.com (m108.s.css.fujitsu.com [10.249.87.108]) by s3.gw.fujitsu.co.jp (Postfix) with ESMTP id 680821DB8037 for ; Tue, 9 Feb 2010 10:38:08 +0900 (JST) Date: Tue, 9 Feb 2010 10:34:41 +0900 From: KAMEZAWA Hiroyuki Subject: Re: [BUGFIX][PATCH] memcg: fix oom killer kills a task in other cgroup Message-Id: <20100209103441.4c86b97e.kamezawa.hiroyu@jp.fujitsu.com> In-Reply-To: <28c262361002081724l1b64e316v3141fb4567dbf905@mail.gmail.com> References: <20100205093932.1dcdeb5f.kamezawa.hiroyu@jp.fujitsu.com> <28c262361002050830m7519f1c3y8860540708527fc0@mail.gmail.com> <20100209093246.36c50bae.kamezawa.hiroyu@jp.fujitsu.com> <28c262361002081724l1b64e316v3141fb4567dbf905@mail.gmail.com> Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Sender: owner-linux-mm@kvack.org To: Minchan Kim Cc: "linux-mm@kvack.org" , "balbir@linux.vnet.ibm.com" , "nishimura@mxp.nes.nec.co.jp" , "akpm@linux-foundation.org" , rientjes@google.com List-ID: On Tue, 9 Feb 2010 10:24:45 +0900 Minchan Kim wrote: > On Tue, Feb 9, 2010 at 9:32 AM, KAMEZAWA Hiroyuki > wrote: > > On Sat, 6 Feb 2010 01:30:49 +0900 > > Minchan Kim wrote: > > > >> Hi, Kame. > >> > >> On Fri, Feb 5, 2010 at 9:39 AM, KAMEZAWA Hiroyuki > >> wrote: > >> > Please take this patch in different context with recent discussion. > >> > This is a quick-fix for a terrible bug. > >> > > >> > This patch itself is against mmotm but can be easily applied to mainline or > >> > stable tree, I think. (But I don't CC stable tree until I get ack.) > >> > > >> > == > >> > Now, oom-killer kills process's chidlren at first. But this means > >> > a child in other cgroup can be killed. But it's not checked now. > >> > > >> > This patch fixes that. > >> > > >> > CC: Balbir Singh > >> > CC: Daisuke Nishimura > >> > Signed-off-by: KAMEZAWA Hiroyuki > >> > --- > >> > A mm/oom_kill.c | A A 3 +++ > >> > A 1 file changed, 3 insertions(+) > >> > > >> > Index: mmotm-2.6.33-Feb03/mm/oom_kill.c > >> > =================================================================== > >> > --- mmotm-2.6.33-Feb03.orig/mm/oom_kill.c > >> > +++ mmotm-2.6.33-Feb03/mm/oom_kill.c > >> > @@ -459,6 +459,9 @@ static int oom_kill_process(struct task_ > >> > A A A A list_for_each_entry(c, &p->children, sibling) { > >> > A A A A A A A A if (c->mm == p->mm) > >> > A A A A A A A A A A A A continue; > >> > + A A A A A A A /* Children may be in other cgroup */ > >> > + A A A A A A A if (mem && !task_in_mem_cgroup(c, mem)) > >> > + A A A A A A A A A A A continue; > >> > A A A A A A A A if (!oom_kill_task(c)) > >> > A A A A A A A A A A A A return 0; > >> > A A A A } > >> > > >> > -- > >> > >> I am worried about latency of OOM at worst case. > >> I mean that task_in_mem_cgroup calls task_lock of child. > >> We have used task_lock in many place. > >> Some place task_lock hold and then other locks. > >> For example, exit_fs held task_lock and try to hold write_lock of fs->lock. > >> If child already hold task_lock and wait to write_lock of fs->lock, OOM latency > >> is dependent of fs->lock. > >> > >> I am not sure how many usecase is also dependent of other locks. > >> If it is not as is, we can't make sure in future. > >> > >> So How about try_task_in_mem_cgroup? > >> If we can't hold task_lock, let's continue next child. > >> > > It's recommended not to use trylock in unclear case. > > > > Then, I think possible replacement will be not-to-use any lock in > > task_in_mem_cgroup. In my short consideration, I don't think task_lock > > is necessary if we can add some tricks and memory barrier. > > > > Please let this patch to go as it is because this is an obvious bug fix > > and give me time. > > I think it's not only a latency problem of OOM but it is also a > problem of deadlock. > We can't expect child's lock state in oom_kill_process. > yes. > So if you can remove lock like below your suggestion, I am OKAY. > I'll try. I don't like both mm->owner and children-kills and now they annoy me ;) For mm, I'll prepare lockless version. For stable tree, I'll prepare trylock version. Thanks, -Kame -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org