From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from psmtp.com (na3sys010amx158.postini.com [74.125.245.158]) by kanga.kvack.org (Postfix) with SMTP id 3369D6B004F for ; Sun, 4 Dec 2011 21:19:52 -0500 (EST) Received: from m4.gw.fujitsu.co.jp (unknown [10.0.50.74]) by fgwmail5.fujitsu.co.jp (Postfix) with ESMTP id B1E843EE0C2 for ; Mon, 5 Dec 2011 11:19:50 +0900 (JST) Received: from smail (m4 [127.0.0.1]) by outgoing.m4.gw.fujitsu.co.jp (Postfix) with ESMTP id 8DF8145DE51 for ; Mon, 5 Dec 2011 11:19:50 +0900 (JST) Received: from s4.gw.fujitsu.co.jp (s4.gw.fujitsu.co.jp [10.0.50.94]) by m4.gw.fujitsu.co.jp (Postfix) with ESMTP id 697BB45DE4D for ; Mon, 5 Dec 2011 11:19:50 +0900 (JST) Received: from s4.gw.fujitsu.co.jp (localhost.localdomain [127.0.0.1]) by s4.gw.fujitsu.co.jp (Postfix) with ESMTP id 5734A1DB803E for ; Mon, 5 Dec 2011 11:19:50 +0900 (JST) Received: from m107.s.css.fujitsu.com (m107.s.css.fujitsu.com [10.240.81.147]) by s4.gw.fujitsu.co.jp (Postfix) with ESMTP id F40BD1DB8040 for ; Mon, 5 Dec 2011 11:19:49 +0900 (JST) Date: Mon, 5 Dec 2011 11:18:35 +0900 From: KAMEZAWA Hiroyuki Subject: Re: [PATCH v7 10/10] Disable task moving when using kernel memory accounting Message-Id: <20111205111835.b1432603.kamezawa.hiroyu@jp.fujitsu.com> In-Reply-To: <4ED914EC.6020500@parallels.com> References: <1322611021-1730-1-git-send-email-glommer@parallels.com> <1322611021-1730-11-git-send-email-glommer@parallels.com> <20111130112210.1d979512.kamezawa.hiroyu@jp.fujitsu.com> <4ED914EC.6020500@parallels.com> Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: Glauber Costa Cc: linux-kernel@vger.kernel.org, paul@paulmenage.org, lizf@cn.fujitsu.com, ebiederm@xmission.com, davem@davemloft.net, gthelen@google.com, netdev@vger.kernel.org, linux-mm@kvack.org, kirill@shutemov.name, avagin@parallels.com, devel@openvz.org, eric.dumazet@gmail.com, cgroups@vger.kernel.org On Fri, 2 Dec 2011 16:11:56 -0200 Glauber Costa wrote: > On 11/30/2011 12:22 AM, KAMEZAWA Hiroyuki wrote: > > On Tue, 29 Nov 2011 21:57:01 -0200 > > Glauber Costa wrote: > > > >> Since this code is still experimental, we are leaving the exact > >> details of how to move tasks between cgroups when kernel memory > >> accounting is used as future work. > >> > >> For now, we simply disallow movement if there are any pending > >> accounted memory. > >> > >> Signed-off-by: Glauber Costa > >> CC: Hiroyouki Kamezawa > >> --- > >> mm/memcontrol.c | 23 ++++++++++++++++++++++- > >> 1 files changed, 22 insertions(+), 1 deletions(-) > >> > >> diff --git a/mm/memcontrol.c b/mm/memcontrol.c > >> index a31a278..dd9a6d9 100644 > >> --- a/mm/memcontrol.c > >> +++ b/mm/memcontrol.c > >> @@ -5453,10 +5453,19 @@ static int mem_cgroup_can_attach(struct cgroup_subsys *ss, > >> { > >> int ret = 0; > >> struct mem_cgroup *memcg = mem_cgroup_from_cont(cgroup); > >> + struct mem_cgroup *from = mem_cgroup_from_task(p); > >> + > >> +#if defined(CONFIG_CGROUP_MEM_RES_CTLR_KMEM)&& defined(CONFIG_INET) > >> + if (from != memcg&& !mem_cgroup_is_root(from)&& > >> + res_counter_read_u64(&from->tcp_mem.tcp_memory_allocated, RES_USAGE)) { > >> + printk(KERN_WARNING "Can't move tasks between cgroups: " > >> + "Kernel memory held.\n"); > >> + return 1; > >> + } > >> +#endif > > > > I wonder....reading all codes again, this is incorrect check. > > > > Hm, let me cralify. IIUC, in old code, "prevent moving" is because you hold > > reference count of cgroup, which can cause trouble at rmdir() as leaking refcnt. > right. > > > BTW, because socket is a shared resource between cgroup, changes in mm->owner > > may cause task cgroup moving implicitly. So, if you allow leak of resource > > here, I guess... you can take mem_cgroup_get() refcnt which is memcg-local and > > allow rmdir(). Then, this limitation may disappear. > > Sorry, I didn't fully understand. Can you clarify further? > If the task is implicitly moved, it will end up calling can_attach as > well, right? > I'm sorry that my explanation is bad. You can take memory cgroup itself's reference count by mem_cgroup_put/get. By getting this, memory cgroup object will continue to exist even after its struct cgroup* is freed by rmdir(). So, assume you do mem_cgroup_get()/put at socket attaching/detatching. 0) A task has a tcp socekts in memcg0. task(memcg0) +- socket0 --> memcg0,usage=4096 1) move this task to memcg1 task(memcg1) +- socket0 --> memcg0,usage=4096 2) The task create a new socket. task(memcg1) +- socekt0 --> memcg0,usage=4096 +- socket1 --> memcg1,usage=xxxx Here, the task will hold 4096bytes of usage in memcg0 implicitly. 3) an admin removes memcg0 task(memcg1) +- socket0 -->memcg0, usage=4096 <-----(*) +- socket1 -->memcg1, usage=xxxx (*) is invisible to users....but this will not be very big problem. Thanks, -Kame Thanks, -Kame -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: email@kvack.org