From: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
To: Tejun Heo <tj@kernel.org>
Cc: "linux-mm@kvack.org" <linux-mm@kvack.org>,
	"cgroups@vger.kernel.org" <cgroups@vger.kernel.org>,
	Michal Hocko <mhocko@suse.cz>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	Han Ying <yinghan@google.com>,
	Glauber Costa <glommer@parallels.com>,
	"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Hiroyuki Kamezawa <kamezawa.hiroyuki@gmail.com>,
	Linux Kernel <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v3][0/6] memcg: prevent -ENOMEM in pre_destroy()
Date: Fri, 22 Jun 2012 08:27:25 +0900
Message-ID: <4FE3ADDD.9060908@jp.fujitsu.com>
In-Reply-To: <20120621202043.GD4642@google.com>




(2012/06/22 5:20), Tejun Heo wrote:
> On Fri, May 11, 2012 at 06:41:36PM +0900, KAMEZAWA Hiroyuki wrote:
>> Hi, here is v3 based on memcg-devel tree.
>> git://github.com/mstsxfx/memcg-devel.git
>>
>> This patch series is for avoiding -ENOMEM at calling pre_destroy()
>> which is called at rmdir(). After this patch, charges will be moved
>> to root (if use_hierarchy==0) or parent (if use_hierarchy==1), and
>> we'll not see -ENOMEM in rmdir() of cgroup.
>>
>> v2 included some other patches than ones for handling -ENOMEM problem,
>> but I divided it. I'd like to post others in different series, later.
>> No logical changes in general, maybe v3 is cleaner than v2.
>>
>> 0001 ....fix error code in memcg-hugetlb
>> 0002 ....add res_counter_uncharge_until
>> 0003 ....use res_counter_uncharge_until in memcg
>> 0004 ....move charges to root if use_hierarchy==0
>> 0005 ....cleanup for mem_cgroup_move_account()
>> 0006 ....remove warning of res_counter_uncharge_nofail (from Costa's slub accounting series).
>
> KAME, how is this progressing?  Is it stuck on anything?
>

I think about 80% of the work is done, and those patches are in the -mm stack now.
They should be visible in -next soon.

The remaining 20% of the work depends on a modification to the cgroup layer.

What do you think of this patch? (It is not tested yet, so it
may have problems.) I think there are not many callers of pre_destroy().

==
 From a28db946f91f3509d25779e8c5db249506cc4b07 Mon Sep 17 00:00:00 2001
From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Date: Fri, 22 Jun 2012 08:38:38 +0900
Subject: [PATCH] cgroup: keep cgroup_mutex() while calling ->pre_destroy()

In the past, memcg's pre_destroy() was very slow because it could trigger
page reclaim. So cgroup_mutex() was released before calling the
pre_destroy() callbacks. Now it is fast enough: memcg just scans the list
and moves pages to another cgroup, and no memory reclaim happens.
So we can keep holding cgroup_mutex() there.

By holding the lock, we can avoid the following cases:
    1. A new task is attached during rmdir().
    2. A new child cgroup is created during rmdir().
    3. A new task is attached to the cgroup and removed from it before
       the css's count is checked, so ->destroy() is called even though
       some leftover state from the task remains.

(Case 3 is terrible, though I think it will not happen in the real world.)

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
  kernel/cgroup.c |    3 +--
  1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index caff6a1..a5b6df1 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -4171,7 +4171,6 @@ again:
  		mutex_unlock(&cgroup_mutex);
  		return -EBUSY;
  	}
-	mutex_unlock(&cgroup_mutex);
  
  	/*
  	 * In general, subsystem has no css->refcnt after pre_destroy(). But
@@ -4190,11 +4189,11 @@ again:
  	 */
  	ret = cgroup_call_pre_destroy(cgrp);
  	if (ret) {
+		mutex_unlock(&cgroup_mutex);
  		clear_bit(CGRP_WAIT_ON_RMDIR, &cgrp->flags);
  		return ret;
  	}
  
-	mutex_lock(&cgroup_mutex);
  	parent = cgrp->parent;
  	if (atomic_read(&cgrp->count) || !list_empty(&cgrp->children)) {
  		clear_bit(CGRP_WAIT_ON_RMDIR, &cgrp->flags);
-- 
1.7.4.1
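
To illustrate the locking pattern the patch above aims for, here is a
minimal userspace sketch using pthreads. It is only an analogy, not kernel
code: toy_group, toy_mutex, toy_attach(), toy_pre_destroy() and toy_rmdir()
are hypothetical names invented for this example. The point it tries to show
is that when the attach path and the rmdir path take the same mutex, and
rmdir keeps it held from the first busy check through the pre_destroy step,
no task can be attached in between.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a cgroup; NOT the kernel's struct cgroup. */
struct toy_group {
	int tasks;	/* attached tasks, playing the role of cgrp->count */
	int children;	/* child groups, playing the role of cgrp->children */
	int charges;	/* state that the pre_destroy step must move away */
	bool removed;
};

/* Plays the role of cgroup_mutex in this sketch. */
static pthread_mutex_t toy_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Attach path: takes the same lock, so it cannot interleave with toy_rmdir(). */
static int toy_attach(struct toy_group *g)
{
	int ret = 0;

	pthread_mutex_lock(&toy_mutex);
	if (g->removed)
		ret = -ENOENT;
	else
		g->tasks++;
	pthread_mutex_unlock(&toy_mutex);
	return ret;
}

/* Assumed to be fast and non-blocking: it only moves charges to the parent. */
static int toy_pre_destroy(struct toy_group *g, struct toy_group *parent)
{
	parent->charges += g->charges;
	g->charges = 0;
	return 0;
}

static int toy_rmdir(struct toy_group *g, struct toy_group *parent)
{
	int ret;

	pthread_mutex_lock(&toy_mutex);
	if (g->tasks || g->children) {
		ret = -EBUSY;
		goto out;
	}

	/* The lock is still held here, unlike the old unlock/relock flow. */
	ret = toy_pre_destroy(g, parent);
	if (ret)
		goto out;

	/* Nothing could attach or create a child since the check above. */
	assert(g->tasks == 0 && g->children == 0);
	g->removed = true;
out:
	pthread_mutex_unlock(&toy_mutex);
	return ret;
}
```

In this sketch, toy_rmdir() returns -EBUSY while a task is attached, and
once it has succeeded a later toy_attach() fails with -ENOENT. The sketch
only works because toy_pre_destroy() is assumed never to sleep for long,
which mirrors the argument in the changelog that memcg's pre_destroy() no
longer does reclaim.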

