From: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
To: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: nishimura@mxp.nes.nec.co.jp,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"balbir@linux.vnet.ibm.com" <balbir@linux.vnet.ibm.com>,
"lizf@cn.fujitsu.com" <lizf@cn.fujitsu.com>,
"menage@google.com" <menage@google.com>,
"akpm@linux-foundation.org" <akpm@linux-foundation.org>
Subject: Re: [PATCH 1/2] memcg: cgroup fix rmdir hang
Date: Wed, 24 Jun 2009 13:05:12 +0900 [thread overview]
Message-ID: <20090624130512.9e4550c4.nishimura@mxp.nes.nec.co.jp> (raw)
In-Reply-To: <20090623160854.93abeecb.kamezawa.hiroyu@jp.fujitsu.com>
On Tue, 23 Jun 2009 16:08:54 +0900, KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:
> From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
>
> Now, cgroup has a logic to wait until ready-to-rmdir for avoiding
> frequent -EBUSY at rmdir.
> (See Commit ec64f51545fffbc4cb968f0cea56341a4b07e85a
> cgroup: fix frequent -EBUSY at rmdir.
>
> Nishimura-san reported bad case for waiting and This is a fix to
> make it reliable. A thread waiting for thread cannot be waken up
> when a refcnt gotten by css_tryget() isn't put immediately.
> (Original code assumed css_put() will be called soon.)
>
> memcg has this case and this is a fix for the problem. This adds
> retry_rmdir() callback to subsys and check we can sleep or not.
>
> Note: another solution will be adding "rmdir state" to subsys.
> But it will be much complicated than this do-enough-check solution.
>
> Changelog v1 -> v2:
> - splitted into 2 patches. This just includes retry_rmdir() modification.
>
> Reported-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Looks good to me.
Reviewed-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
I've been testing with [2/2] applied (because both of these patches are
necessary to fix this problem completely), and it works fine so far.
But I want to test in more test cases.
Thanks,
Daisuke Nishimura.
> ---
> Documentation/cgroups/cgroups.txt | 11 +++++++++++
> include/linux/cgroup.h | 1 +
> kernel/cgroup.c | 20 +++++++++++++++++++-
> mm/memcontrol.c | 14 ++++++++++++--
> 4 files changed, 43 insertions(+), 3 deletions(-)
>
> Index: fix-rmdir-cgroup/include/linux/cgroup.h
> ===================================================================
> --- fix-rmdir-cgroup.orig/include/linux/cgroup.h
> +++ fix-rmdir-cgroup/include/linux/cgroup.h
> @@ -374,6 +374,7 @@ struct cgroup_subsys {
> struct cgroup_subsys_state *(*create)(struct cgroup_subsys *ss,
> struct cgroup *cgrp);
> int (*pre_destroy)(struct cgroup_subsys *ss, struct cgroup *cgrp);
> + int (*retry_rmdir)(struct cgroup_subsys *ss, struct cgroup *cgrp);
> void (*destroy)(struct cgroup_subsys *ss, struct cgroup *cgrp);
> int (*can_attach)(struct cgroup_subsys *ss,
> struct cgroup *cgrp, struct task_struct *tsk);
> Index: fix-rmdir-cgroup/kernel/cgroup.c
> ===================================================================
> --- fix-rmdir-cgroup.orig/kernel/cgroup.c
> +++ fix-rmdir-cgroup/kernel/cgroup.c
> @@ -636,6 +636,23 @@ static int cgroup_call_pre_destroy(struc
> }
> return ret;
> }
> +/*
> + * Call subsys's retry_rmdir() handler. If this returns non-Zero, we retry
> + * rmdir immediately and call pre_destroy again.
> + */
> +static int cgroup_check_retry_rmdir(struct cgroup *cgrp)
> +{
> + struct cgroup_subsys *ss;
> + int ret = 0;
> +
> + for_each_subsys(cgrp->root, ss)
> + if (ss->pre_destroy) {
> + ret = ss->retry_rmdir(ss, cgrp);
> + if (ret)
> + break;
> + }
> + return ret;
> +}
>
> static void free_cgroup_rcu(struct rcu_head *obj)
> {
> @@ -2722,7 +2739,8 @@ again:
>
> if (!cgroup_clear_css_refs(cgrp)) {
> mutex_unlock(&cgroup_mutex);
> - schedule();
> + if (!cgroup_check_retry_rmdir(cgrp))
> + schedule();
> finish_wait(&cgroup_rmdir_waitq, &wait);
> clear_bit(CGRP_WAIT_ON_RMDIR, &cgrp->flags);
> if (signal_pending(current))
> Index: fix-rmdir-cgroup/mm/memcontrol.c
> ===================================================================
> --- fix-rmdir-cgroup.orig/mm/memcontrol.c
> +++ fix-rmdir-cgroup/mm/memcontrol.c
> @@ -1457,8 +1457,6 @@ __mem_cgroup_commit_charge_swapin(struct
> }
> rcu_read_unlock();
> }
> - /* add this page(page_cgroup) to the LRU we want. */
> -
> }
>
> void mem_cgroup_commit_charge_swapin(struct page *page, struct mem_cgroup *ptr)
> @@ -2571,6 +2569,17 @@ static int mem_cgroup_pre_destroy(struct
> return mem_cgroup_force_empty(mem, false);
> }
>
> +static int mem_cgroup_retry_rmdir(struct cgroup_subsys *ss,
> + struct cgroup *cont)
> +{
> + struct mem_cgroup *mem = mem_cgroup_from_cont(cont);
> +
> + if (res_counter_read_u64(&mem->res, RES_USAGE))
> + return 1;
> + return 0;
> +}
> +
> +
> static void mem_cgroup_destroy(struct cgroup_subsys *ss,
> struct cgroup *cont)
> {
> @@ -2610,6 +2619,7 @@ struct cgroup_subsys mem_cgroup_subsys =
> .subsys_id = mem_cgroup_subsys_id,
> .create = mem_cgroup_create,
> .pre_destroy = mem_cgroup_pre_destroy,
> + .retry_rmdir = mem_cgroup_retry_rmdir,
> .destroy = mem_cgroup_destroy,
> .populate = mem_cgroup_populate,
> .attach = mem_cgroup_move_task,
> Index: fix-rmdir-cgroup/Documentation/cgroups/cgroups.txt
> ===================================================================
> --- fix-rmdir-cgroup.orig/Documentation/cgroups/cgroups.txt
> +++ fix-rmdir-cgroup/Documentation/cgroups/cgroups.txt
> @@ -500,6 +500,17 @@ there are not tasks in the cgroup. If pr
> rmdir() will fail with it. From this behavior, pre_destroy() can be
> called multiple times against a cgroup.
>
> +int retry_rmdir(struct cgroup_subsys *ss, struct cgroup *cgrp);
> +
> +Called at rmdir right after the kernel finds there are remaining refcnt on
> +subsystems after pre_destroy(). When retry_rmdir() returns 0, the caller enter
> +sleep and wakes up when css's refcnt goes down to 0 by css_put().
> +When this returns 1, the caller doesn't sleep and retry rmdir immediately.
> +This is useful when the subsys knows remaining css's refcnt is not temporal
> +and to calling pre_destroy() again is proper way to remove that.
> +(or proper way to retrun -EBUSY.)
> +
> +
> int can_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
> struct task_struct *task)
> (cgroup_mutex held by caller)
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org. For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2009-06-24 4:08 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-06-23 7:07 [PATCH 0/2] " KAMEZAWA Hiroyuki
2009-06-23 7:08 ` [PATCH 1/2] " KAMEZAWA Hiroyuki
2009-06-24 4:05 ` Daisuke Nishimura [this message]
2009-06-25 21:50 ` Andrew Morton
2009-06-26 0:14 ` KAMEZAWA Hiroyuki
2009-06-23 7:10 ` [PATCH 2/2] memcg: wakeup rmdir waiter if necessary KAMEZAWA Hiroyuki
2009-06-25 21:28 ` [PATCH 0/2] memcg: cgroup fix rmdir hang Andrew Morton
2009-06-25 23:29 ` KAMEZAWA Hiroyuki
2009-06-26 5:10 ` [PATCH] " KAMEZAWA Hiroyuki
2009-06-26 6:00 ` Daisuke Nishimura
2009-06-29 4:10 ` Daisuke Nishimura
2009-06-26 20:16 ` Paul Menage
2009-06-28 23:50 ` KAMEZAWA Hiroyuki
2009-06-28 23:56 ` KAMEZAWA Hiroyuki
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20090624130512.9e4550c4.nishimura@mxp.nes.nec.co.jp \
--to=nishimura@mxp.nes.nec.co.jp \
--cc=akpm@linux-foundation.org \
--cc=balbir@linux.vnet.ibm.com \
--cc=kamezawa.hiroyu@jp.fujitsu.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=lizf@cn.fujitsu.com \
--cc=menage@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox