* [RFC][BUGFIX] memcg: rmdir doesn't return
@ 2009-06-12  5:33 Daisuke Nishimura
  2009-06-12  6:19 ` KAMEZAWA Hiroyuki
  0 siblings, 1 reply; 19+ messages in thread
From: Daisuke Nishimura @ 2009-06-12  5:33 UTC
  To: linux-mm; +Cc: KAMEZAWA Hiroyuki, Balbir Singh, Li Zefan, Daisuke Nishimura

Hi.

I found a problem with rmdir: rmdir doesn't return (or takes a very, very long time).
Actually, I found this problem long ago, but I didn't have enough time to
track it down until the stale swap cache problem had been fixed.

The cause of this problem is the combination of commit ec64f51545fffbc4cb968f0cea56341a4b07e85a
(cgroup: fix frequent -EBUSY at rmdir) and memcg's behavior at swap-in.

The commit introduced cgroup_rmdir_waitq and made rmdir wait until someone
(whoever decrements css->refcnt to 1) wakes it up.
But even after pre_destroy has succeeded, which means mem.usage has
dropped to 0, a process that has moved from the cgroup being removed to
another cgroup can increment mem.usage (and, as a result, css->refcnt) by
doing swap-in.
This css->refcnt is not dropped, so the rmdir process is not woken up,
until the owner process frees the page.

So simply "waking up after a while", as in the patch below, fixes this problem.

===
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 3737a68..2fe9645 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -2722,7 +2722,7 @@ again:
 
 	if (!cgroup_clear_css_refs(cgrp)) {
 		mutex_unlock(&cgroup_mutex);
-		schedule();
+		schedule_timeout(HZ/10);	/* don't wait forever */
 		finish_wait(&cgroup_rmdir_waitq, &wait);
 		clear_bit(CGRP_WAIT_ON_RMDIR, &cgrp->flags);
 		if (signal_pending(current))
===
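
To illustrate why a periodic wake-up is enough, here is a toy,
single-threaded userspace model in C. It is not kernel code: every name
(toy_cgroup, toy_pre_destroy, toy_swapin, toy_rmdir) is hypothetical, and
it assumes that each charged page holds one reference and that the retry
path re-runs pre_destroy, as the "again:" label in the hunk context
suggests.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one cgroup's memcg state (hypothetical, not kernel code). */
struct toy_cgroup {
	int refcnt;          /* 1 == only the base reference is left */
	int usage;           /* pages currently charged to this group */
	int pending_swapins; /* swap-ins a migrated task will still perform */
};

/* Models pre_destroy: force usage to 0, dropping one reference per page. */
static void toy_pre_destroy(struct toy_cgroup *cg)
{
	cg->refcnt -= cg->usage;
	cg->usage = 0;
	assert(cg->refcnt >= 1);
}

/* Models a swap-in charged to the old group by a task that already left. */
static void toy_swapin(struct toy_cgroup *cg)
{
	if (cg->pending_swapins > 0) {
		cg->pending_swapins--;
		cg->usage++;
		cg->refcnt++;
	}
}

/*
 * Returns the number of passes rmdir needed, or -1 if it would sleep
 * forever. ASSUMPTION: the page owner never frees the page, so without a
 * timeout nobody ever drops refcnt back to 1 and wakes the sleeper.
 */
static int toy_rmdir(struct toy_cgroup *cg, bool use_timeout)
{
	for (int pass = 1; pass <= 100; pass++) {
		toy_pre_destroy(cg);
		toy_swapin(cg);      /* race window after pre_destroy */
		if (cg->refcnt == 1)
			return pass; /* references cleared: rmdir completes */
		if (!use_timeout)
			return -1;   /* models schedule(): no wake-up comes */
		/* models schedule_timeout(): retry from the top */
	}
	return -1;
}
```

In this model, a group with one pending swap-in never finishes with
use_timeout == false, and finishes on the second pass with
use_timeout == true, because the second pre_destroy pass drains the newly
charged page.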

But is there any reason why we should charge a newly swapped-in page to
"the group to which the swap has been charged" rather than to "the group in
which the process is now"?
I agree that at swap-in we should uncharge "swap" from "the group to which
the swap has been charged", but IIUC, memcg before/without the mem+swap
controller behaves in the latter way when charging a swapped-in page.

I've confirmed that the patch below also fixes this rmdir problem.

===
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 6ceb6f2..dbece65 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1063,7 +1063,7 @@ static struct mem_cgroup *mem_cgroup_lookup(unsigned short id)
 
 static struct mem_cgroup *try_get_mem_cgroup_from_swapcache(struct page *page)
 {
-	struct mem_cgroup *mem;
+	struct mem_cgroup *mem = NULL;
 	struct page_cgroup *pc;
 	unsigned short id;
 	swp_entry_t ent;
@@ -1079,14 +1079,6 @@ static struct mem_cgroup *try_get_mem_cgroup_from_swapcache(struct page *page)
 		mem = pc->mem_cgroup;
 		if (mem && !css_tryget(&mem->css))
 			mem = NULL;
-	} else {
-		ent.val = page_private(page);
-		id = lookup_swap_cgroup(ent);
-		rcu_read_lock();
-		mem = mem_cgroup_lookup(id);
-		if (mem && !css_tryget(&mem->css))
-			mem = NULL;
-		rcu_read_unlock();
 	}
 	unlock_page_cgroup(pc);
 	return mem;
===
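
The difference between the two charging policies can be sketched with a
small userspace model in C. This is only an illustration: the types and
functions (toy_memcg, toy_swap_page, toy_lookup, toy_charge_target) are
hypothetical, and it assumes that when the lookup returns NULL the caller
falls back to charging the group of the process doing the swap-in.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_memcg { const char *name; };

/* Hypothetical stand-in for a swap-cache page being charged at swap-in. */
struct toy_swap_page {
	bool already_charged;         /* page_cgroup is marked used */
	struct toy_memcg *page_owner; /* group the page is charged to */
	struct toy_memcg *swap_owner; /* group recorded for the swap entry */
};

/*
 * Models the lookup. lookup_swap_owner == true models the behaviour before
 * the patch; false models the patch, where the swap record is not consulted.
 */
static struct toy_memcg *toy_lookup(const struct toy_swap_page *pg,
				    bool lookup_swap_owner)
{
	if (pg->already_charged)
		return pg->page_owner;
	return lookup_swap_owner ? pg->swap_owner : NULL;
}

/* ASSUMPTION: on NULL the caller charges the faulting task's current group. */
static struct toy_memcg *toy_charge_target(const struct toy_swap_page *pg,
					   struct toy_memcg *current_group,
					   bool lookup_swap_owner)
{
	struct toy_memcg *mem = toy_lookup(pg, lookup_swap_owner);

	return mem ? mem : current_group;
}
```

For a page that is not yet charged, the first policy picks the group that
owned the swap entry (possibly the one being removed), while the second
picks the group the process belongs to now, so no new reference lands on
the old group.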


Any suggestions?


Thanks,
Daisuke Nishimura.


Thread overview: 19+ messages
2009-06-12  5:33 [RFC][BUGFIX] memcg: rmdir doesn't return Daisuke Nishimura
2009-06-12  6:19 ` KAMEZAWA Hiroyuki
2009-06-15  2:50   ` Daisuke Nishimura
2009-06-15  3:02     ` KAMEZAWA Hiroyuki
2009-06-15  8:17       ` KAMEZAWA Hiroyuki
2009-06-16  2:47         ` Daisuke Nishimura
2009-06-16  5:00           ` KAMEZAWA Hiroyuki
2009-06-16  6:38             ` Daisuke Nishimura
2009-06-16  6:48               ` KAMEZAWA Hiroyuki
2009-06-16  8:44                 ` KAMEZAWA Hiroyuki
2009-06-17  4:56                   ` Balbir Singh
2009-06-17  5:11                     ` KAMEZAWA Hiroyuki
2009-06-17  5:49                       ` Balbir Singh
2009-06-17  6:27                         ` KAMEZAWA Hiroyuki
2009-06-17  7:35                           ` Balbir Singh
2009-06-17  9:05                             ` KAMEZAWA Hiroyuki
2009-06-17  9:24                               ` Balbir Singh
2009-06-18  3:03                   ` Daisuke Nishimura
2009-06-18  3:21                     ` KAMEZAWA Hiroyuki
