linux-mm.kvack.org archive mirror
From: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
To: linux-mm <linux-mm@kvack.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	Balbir Singh <balbir@linux.vnet.ibm.com>,
	Li Zefan <lizf@cn.fujitsu.com>,
	Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Subject: [RFC][BUGFIX] memcg: rmdir doesn't return
Date: Fri, 12 Jun 2009 14:33:46 +0900
Message-ID: <20090612143346.68e1f006.nishimura@mxp.nes.nec.co.jp>

Hi.

I found a problem with rmdir: rmdir doesn't return (or takes a very, very long time).
Actually, I found this problem long ago, but I didn't have enough time to
track it down until the stale swap cache problem had been fixed.

The cause of this problem is commit ec64f51545fffbc4cb968f0cea56341a4b07e85a
(cgroup: fix frequent -EBUSY at rmdir) combined with memcg's behavior at swap-in.

The commit introduced cgroup_rmdir_waitq and makes rmdir wait until someone
(whoever decrements css->refcnt to 1) wakes it up.
But even after pre_destroy has succeeded, which means mem.usage has
become 0, a process which has moved from the cgroup being removed to another cgroup
can increment mem.usage (and css->refcnt as a result) by doing swap-in.
This css->refcnt won't be dropped, that is, the rmdir process won't be woken up,
until the owner process frees the page.
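
To make the sequence above concrete, here is a toy userspace model in C.
It is only a sketch: every toy_* name is hypothetical, nothing here is
kernel code, and it assumes (as described above) that each charge takes one
reference on the group and that rmdir may proceed only when the base
reference is the last one left.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; the names echo the kernel's but this is not kernel code. */
struct toy_css {
	int refcnt;	/* 1 == only the base reference is left */
	long usage;	/* pages charged to this group */
};

/* Assumption from the text: charging a page takes one reference. */
static void toy_charge(struct toy_css *css)
{
	css->usage++;
	css->refcnt++;
}

static void toy_uncharge(struct toy_css *css)
{
	assert(css->usage > 0);
	css->usage--;
	css->refcnt--;
}

/* Stand-in for pre_destroy: drop every charge still held by the group. */
static void toy_pre_destroy(struct toy_css *css)
{
	while (css->usage > 0)
		toy_uncharge(css);
}

/* rmdir may proceed only when the base reference is the last one. */
static bool toy_rmdir_can_proceed(const struct toy_css *css)
{
	return css->refcnt == 1;
}

/*
 * Replay the race: pre_destroy succeeds, then a task that has already left
 * the group swaps a page in and the charge lands on the old group.
 * Returns true if rmdir would be left sleeping.
 */
static bool toy_replay_race(void)
{
	struct toy_css css = { .refcnt = 1, .usage = 0 };

	toy_charge(&css);	/* the group has one page charged */
	toy_pre_destroy(&css);	/* usage back to 0, refcnt back to 1 */
	toy_charge(&css);	/* late swap-in by the moved task */
	return !toy_rmdir_can_proceed(&css);
}
```

In this model toy_replay_race() returns true: the late swap-in leaves
refcnt at 2, and nothing drops it until the owning task frees the page.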

So, just "waking up after a while", as in the patch below, can fix this problem.

===
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 3737a68..2fe9645 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -2722,7 +2722,7 @@ again:
 
 	if (!cgroup_clear_css_refs(cgrp)) {
 		mutex_unlock(&cgroup_mutex);
-		schedule();
+		schedule_timeout(HZ/10);	/* don't wait forever */
 		finish_wait(&cgroup_rmdir_waitq, &wait);
 		clear_bit(CGRP_WAIT_ON_RMDIR, &cgrp->flags);
 		if (signal_pending(current))
===
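
Why a bounded sleep is enough can be sketched with another toy model in C.
This is an illustration under assumptions, not the kernel's logic: the
sim_* names are hypothetical, and it assumes that after waking up the rmdir
path jumps back to "again:" and runs pre_destroy once more, which pushes
the stray charge out of the group.

```c
#include <assert.h>
#include <stdbool.h>

struct sim_group {
	int refcnt;		/* 1 == only the base reference remains */
	int late_swapins;	/* swap-ins that will land after pre_destroy */
};

/* Stand-in for pre_destroy: force every outstanding charge out of the group. */
static void sim_pre_destroy(struct sim_group *g)
{
	g->refcnt = 1;
}

/* A moved task faults a swapped-out page back in; the old group is charged. */
static void sim_late_swapin(struct sim_group *g)
{
	if (g->late_swapins > 0) {
		g->late_swapins--;
		g->refcnt++;
	}
}

/*
 * Returns the number of passes rmdir needed, or -1 if it would sleep forever
 * (or give up after max_passes).
 * bounded_wait == false models schedule(): in this scenario no wakeup ever
 * arrives, because nobody drops the stray reference.
 * bounded_wait == true models schedule_timeout(): the sleeper wakes up by
 * itself and retries (assumed to re-run pre_destroy).
 */
static int sim_rmdir(struct sim_group *g, bool bounded_wait, int max_passes)
{
	assert(max_passes > 0);
	for (int pass = 1; pass <= max_passes; pass++) {
		sim_pre_destroy(g);
		sim_late_swapin(g);	/* race window after pre_destroy */
		if (g->refcnt == 1)
			return pass;	/* the refs could be cleared now */
		if (!bounded_wait)
			return -1;	/* asleep, and no one will wake us */
	}
	return -1;
}
```

With one late swap-in queued, sim_rmdir(&g, false, 10) returns -1, while
sim_rmdir(&g, true, 10) succeeds on the second pass.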

But is there any reason why we should charge a NEW swapped-in page to
"the group to which the swap has been charged", not to "the group in which
the process is now"?
I agree that at swap-in we should uncharge "swap" from "the group to which
the swap has been charged", but IIUC, memcg before/without the mem+swap controller
charges a swapped-in page the latter way.

I've confirmed that the patch below also fixes this rmdir problem.

===
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 6ceb6f2..dbece65 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1063,7 +1063,7 @@ static struct mem_cgroup *mem_cgroup_lookup(unsigned short id)
 
 static struct mem_cgroup *try_get_mem_cgroup_from_swapcache(struct page *page)
 {
-	struct mem_cgroup *mem;
+	struct mem_cgroup *mem = NULL;
 	struct page_cgroup *pc;
 	unsigned short id;
 	swp_entry_t ent;
@@ -1079,14 +1079,6 @@ static struct mem_cgroup *try_get_mem_cgroup_from_swapcache(struct page *page)
 		mem = pc->mem_cgroup;
 		if (mem && !css_tryget(&mem->css))
 			mem = NULL;
-	} else {
-		ent.val = page_private(page);
-		id = lookup_swap_cgroup(ent);
-		rcu_read_lock();
-		mem = mem_cgroup_lookup(id);
-		if (mem && !css_tryget(&mem->css))
-			mem = NULL;
-		rcu_read_unlock();
 	}
 	unlock_page_cgroup(pc);
 	return mem;
===
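
The two charging policies being compared can be written down as a small
decision function. This is a hedged sketch in plain C with hypothetical
toy_* names; it assumes that when the lookup yields no group, the caller
falls back to the group of the faulting task, which is the behavior the
patch above relies on.

```c
#include <assert.h>
#include <stddef.h>

struct toy_memcg { const char *name; };

enum toy_policy {
	CHARGE_SWAP_OWNER,	/* current behavior: follow the swap record */
	CHARGE_CURRENT_TASK,	/* behavior with the patch applied */
};

/*
 * page_owner: group already recorded on the page, or NULL if none
 * swap_owner: group the swap entry was charged to, or NULL if unknown
 * task_group: group the faulting task belongs to now (must not be NULL)
 */
static const struct toy_memcg *
toy_pick_charge_target(enum toy_policy policy,
		       const struct toy_memcg *page_owner,
		       const struct toy_memcg *swap_owner,
		       const struct toy_memcg *task_group)
{
	assert(task_group != NULL);
	if (page_owner)
		return page_owner;	/* page is already accounted somewhere */
	if (policy == CHARGE_SWAP_OWNER && swap_owner)
		return swap_owner;	/* may be the group being removed */
	return task_group;		/* assumed fallback: charge the current group */
}
```

For a task that has moved from an old group to a new one,
CHARGE_SWAP_OWNER selects the old group (and so can pin it during rmdir),
while CHARGE_CURRENT_TASK selects the new one.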


Any suggestions?


Thanks,
Daisuke Nishimura.


