From mboxrd@z Thu Jan 1 00:00:00 1970 Date: Sat, 15 Nov 2008 20:04:00 +0900 From: Daisuke Nishimura Subject: Re: [PATCH mmotm] memcg: make resize limit hold mutex Message-Id: <20081115200400.1399c7e0.d-nishimura@mtf.biglobe.ne.jp> In-Reply-To: <41265.10.75.179.62.1226745093.squirrel@webmail-b.css.fujitsu.com> References: <20081115183721.cfc1b80b.d-nishimura@mtf.biglobe.ne.jp> <41265.10.75.179.62.1226745093.squirrel@webmail-b.css.fujitsu.com> Reply-To: nishimura@mxp.nes.nec.co.jp Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org Return-Path: To: KAMEZAWA Hiroyuki Cc: linux-mm , Andrew Morton , Balbir Singh , Li Zefan , d-nishimura@mtf.biglobe.ne.jp, nishimura@mxp.nes.nec.co.jp List-ID: On Sat, 15 Nov 2008 19:31:33 +0900 (JST) "KAMEZAWA Hiroyuki" wrote: > Daisuke Nishimura said: > > > > This patch define a new mutex and make both mem_cgroup_resize_limit and > > mem_cgroup_memsw_resize_limit hold it to remove spin_lock_irqsave. > > > Thanks, > > > > > Signed-off-by: Daisuke Nishimura > > --- > > > - while (res_counter_set_limit(&memcg->res, val)) { > > + while (retry_count) { > > if (signal_pending(current)) { > > ret = -EINTR; > > break; > > } > > - if (!retry_count) { > > - ret = -EBUSY; > > - break; > > + /* > > + * Rather than hide all in some function, I do this in > > + * open coded manner. You see what this really does. > > + * We have to guarantee mem->res.limit < mem->memsw.limit. > > + */ > > + if (do_swap_account) { > > + mutex_lock(&set_limit_mutex); > > + memswlimit = res_counter_read_u64(&memcg->memsw, > > + RES_LIMIT); > > + if (memswlimit < val) { > > + ret = -EINVAL; > > + mutex_unlock(&set_limit_mutex); > > + break; > > + } > > + ret = res_counter_set_limit(&memcg->res, val); > > + mutex_unlock(&set_limit_mutex); > > } > > Maybe !do_swap_account case is not handled. Oops, you are right. > I think in !do_swap_account case, memsw.limit is inifinite. > So, just removing this "if" is ok. > > No objection to your direction, could you fix ? > Sure. This is the fixed version. Thanks, Daisuke Nishimura. ===== From: Daisuke Nishimura mem_cgroup_resize_memsw_limit() try to hold memsw.lock while holding res.lock, so below message is showed when trying to write memory.memsw.limit_in_bytes file. [ INFO: possible recursive locking detected ] 2.6.28-rc4-mm1-mmotm-2008-11-14-20-50-ef4e17ef #1 bash/4406 is trying to acquire lock: (&counter->lock){....}, at: [] mem_cgroup_resize_memsw_limit+0x8d/0x113 but task is already holding lock: (&counter->lock){....}, at: [] mem_cgroup_resize_memsw_limit+0x5b/0x113 other info that might help us debug this: 1 lock held by bash/4406: #0: (&counter->lock){....}, at: [] mem_cgroup_resize_memsw_limit+0x5b/0x113 stack backtrace: Pid: 4406, comm: bash Not tainted 2.6.28-rc4-mm1-mmotm-2008-11-14-20-50-ef4e17ef #1 Call Trace: [] ? printk+0xf/0x18 [] __lock_acquire+0xc67/0x1353 [] ? __lock_acquire+0x133a/0x1353 [] lock_acquire+0x70/0x97 [] ? mem_cgroup_resize_memsw_limit+0x8d/0x113 [] _spin_lock_irqsave+0x3a/0x6d [] ? mem_cgroup_resize_memsw_limit+0x8d/0x113 [] mem_cgroup_resize_memsw_limit+0x8d/0x113 [] ? memparse+0x14/0x66 [] mem_cgroup_write+0x4a/0x50 [] cgroup_file_write+0x181/0x1c6 [] ? lock_release_holdtime+0x1a/0x168 [] ? security_file_permission+0xf/0x11 [] ? rw_verify_area+0x76/0x97 [] ? cgroup_file_write+0x0/0x1c6 [] vfs_write+0x8a/0x12e [] sys_write+0x3b/0x60 [] sysenter_do_call+0x12/0x3f This patch define a new mutex and make both mem_cgroup_resize_limit and mem_cgroup_memsw_resize_limit hold it to remove spin_lock_irqsave. Signed-off-by: Daisuke Nishimura --- This patch can be applied after memcg-add-mem_cgroup_disabled-fix.patch. mm/memcontrol.c | 40 ++++++++++++++++++++++++++-------------- 1 files changed, 26 insertions(+), 14 deletions(-) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 09ce42a..691e052 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -35,6 +35,7 @@ #include #include #include +#include #include "internal.h" #include @@ -1147,27 +1148,38 @@ int mem_cgroup_shrink_usage(struct mm_struct *mm, gfp_t gfp_mask) return 0; } +static DEFINE_MUTEX(set_limit_mutex); int mem_cgroup_resize_limit(struct mem_cgroup *memcg, unsigned long long val) { int retry_count = MEM_CGROUP_RECLAIM_RETRIES; int progress; + u64 memswlimit; int ret = 0; - if (do_swap_account) { - if (val > memcg->memsw.limit) - return -EINVAL; - } - - while (res_counter_set_limit(&memcg->res, val)) { + while (retry_count) { if (signal_pending(current)) { ret = -EINTR; break; } - if (!retry_count) { - ret = -EBUSY; + /* + * Rather than hide all in some function, I do this in + * open coded manner. You see what this really does. + * We have to guarantee mem->res.limit < mem->memsw.limit. + */ + mutex_lock(&set_limit_mutex); + memswlimit = res_counter_read_u64(&memcg->memsw, RES_LIMIT); + if (memswlimit < val) { + ret = -EINVAL; + mutex_unlock(&set_limit_mutex); break; } + ret = res_counter_set_limit(&memcg->res, val); + mutex_unlock(&set_limit_mutex); + + if (!ret) + break; + progress = try_to_free_mem_cgroup_pages(memcg, GFP_HIGHUSER_MOVABLE, false); if (!progress) @@ -1180,7 +1192,6 @@ int mem_cgroup_resize_memsw_limit(struct mem_cgroup *memcg, unsigned long long val) { int retry_count = MEM_CGROUP_RECLAIM_RETRIES; - unsigned long flags; u64 memlimit, oldusage, curusage; int ret; @@ -1197,19 +1208,20 @@ int mem_cgroup_resize_memsw_limit(struct mem_cgroup *memcg, * open coded manner. You see what this really does. * We have to guarantee mem->res.limit < mem->memsw.limit. */ - spin_lock_irqsave(&memcg->res.lock, flags); - memlimit = memcg->res.limit; + mutex_lock(&set_limit_mutex); + memlimit = res_counter_read_u64(&memcg->res, RES_LIMIT); if (memlimit > val) { - spin_unlock_irqrestore(&memcg->res.lock, flags); ret = -EINVAL; + mutex_unlock(&set_limit_mutex); break; } ret = res_counter_set_limit(&memcg->memsw, val); - oldusage = memcg->memsw.usage; - spin_unlock_irqrestore(&memcg->res.lock, flags); + mutex_unlock(&set_limit_mutex); if (!ret) break; + + oldusage = res_counter_read_u64(&memcg->memsw, RES_USAGE); try_to_free_mem_cgroup_pages(memcg, GFP_HIGHUSER_MOVABLE, true); curusage = res_counter_read_u64(&memcg->memsw, RES_USAGE); if (curusage >= oldusage) -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org