linux-mm.kvack.org archive mirror
From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
To: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: "linux-mm@kvack.org" <linux-mm@kvack.org>,
	"balbir@linux.vnet.ibm.com" <balbir@linux.vnet.ibm.com>,
	"nishimura@mxp.nes.nec.co.jp" <nishimura@mxp.nes.nec.co.jp>,
	"lizf@cn.fujitsu.com" <lizf@cn.fujitsu.com>,
	"menage@google.com" <menage@google.com>,
	"kosaki.motohiro@jp.fujitsu.com" <kosaki.motohiro@jp.fujitsu.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: [RFC][PATCH 5/6] fix inactive_ratio under hierarchy
Date: Tue, 9 Dec 2008 20:10:23 +0900
Message-ID: <20081209201023.65bb98e6.kamezawa.hiroyu@jp.fujitsu.com>
In-Reply-To: <20081209200213.0e2128c1.kamezawa.hiroyu@jp.fujitsu.com>


From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>

After the LRU updates for memcg, the following test easily hits OOM,
and memory reclaim is very slow.

	mkdir /opt/cgroup/xxx
	echo 1 > /opt/cgroup/xxx/memory.use_hierarchy
	mkdir /opt/cgroup/xxx/01
	mkdir /opt/cgroup/xxx/02
	echo 40M > /opt/cgroup/xxx/memory.limit_in_bytes
	
	Run task under group 01 or 02.

This is because the calculation of inactive_ratio doesn't handle hierarchy.
In the above, the inactive_ratio of 01 and 02 is 65535, so their inactive
lists will be empty.
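
The 65535 above can be reproduced in userspace. The sketch below is not
kernel code: it assumes an unlimited group reads back as LLONG_MAX from
RES_LIMIT and that the old code's "unsigned int gb" is 32 bits wide.
isqrt_u32() and old_inactive_ratio() are hypothetical stand-ins for
int_sqrt() and the pre-patch mem_cgroup_set_inactive_ratio().

```c
#include <limits.h>
#include <stdint.h>

/* Floor of the square root; userspace stand-in for the kernel's int_sqrt(). */
static uint32_t isqrt_u32(uint32_t x)
{
	uint64_t r = 0;

	while ((r + 1) * (r + 1) <= x)
		r++;
	return (uint32_t)r;
}

/*
 * Emulates the pre-patch arithmetic: the limit in GB is stored in a
 * 32-bit variable, and 10 * gb is evaluated in 32 bits as well.
 */
static uint32_t old_inactive_ratio(unsigned long long limit_bytes)
{
	uint32_t gb = (uint32_t)(limit_bytes >> 30);	/* truncates to 32 bits */
	uint32_t prod;

	if (!gb)
		return 1;
	prod = (uint32_t)((uint64_t)gb * 10u);		/* wraps modulo 2^32 */
	return isqrt_u32(prod);
}
```

Under these assumptions, old_inactive_ratio(LLONG_MAX) returns 65535, while
a 40M limit gives 1 and a 1G limit gives 3.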

This patch sets the inactive_ratio of 01 and 02 to an appropriate value
under hierarchy. inactive_ratio is calculated from the minimum limit found
walking upwards in the hierarchy.


Example: in the following tree,
	/opt/cgroup/01		limit=1G
	/opt/cgroup/01/A	limit=500M
	/opt/cgroup/01/A/B	limit=unlimited
	/opt/cgroup/01/A/C	limit=50M
	/opt/cgroup/01/Z	limit=700M


	/opt/cgroup/01's inactive_ratio is calculated by limit of 1G.
	/opt/cgroup/01/A's inactive_ratio is calculated by limit of 500M.
	/opt/cgroup/01/A/B's inactive_ratio is calculated by limit of 500M.
	/opt/cgroup/01/A/C's inactive_ratio is calculated by limit of 50M.
	/opt/cgroup/01/Z's inactive_ratio is calculated by limit of 700M.
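
The per-group limits listed above follow from a simple walk towards the
root that keeps the smallest limit seen. A minimal userspace sketch,
assuming every group in the tree has use_hierarchy set; struct group,
min_limit_upwards(), the MB()/GB() macros and UNLIMITED are hypothetical
names for illustration, not kernel interfaces:

```c
#include <stddef.h>

#define MB(x)		((unsigned long long)(x) << 20)
#define GB(x)		((unsigned long long)(x) << 30)
#define UNLIMITED	(~0ULL >> 1)	/* assumed stand-in for "no limit" */

struct group {
	const char *path;
	unsigned long long limit;
	const struct group *parent;
};

/* Return the smallest limit among the group itself and all its ancestors. */
static unsigned long long min_limit_upwards(const struct group *g)
{
	unsigned long long min_limit = g->limit;

	for (g = g->parent; g != NULL; g = g->parent)
		if (g->limit < min_limit)
			min_limit = g->limit;
	return min_limit;
}

/* The example tree from this mail. */
const struct group g01 = { "/opt/cgroup/01",     GB(1),     NULL };
const struct group gA  = { "/opt/cgroup/01/A",   MB(500),   &g01 };
const struct group gB  = { "/opt/cgroup/01/A/B", UNLIMITED, &gA  };
const struct group gC  = { "/opt/cgroup/01/A/C", MB(50),    &gA  };
const struct group gZ  = { "/opt/cgroup/01/Z",   MB(700),   &g01 };
```

Here min_limit_upwards(&gB) yields 500M because B itself is unlimited and
its parent A is the tightest ancestor, while min_limit_upwards(&gC) stays
at C's own 50M.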


Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>

 mm/memcontrol.c |   71 ++++++++++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 64 insertions(+), 7 deletions(-)

---
Index: mmotm-2.6.28-Dec08/mm/memcontrol.c
===================================================================
--- mmotm-2.6.28-Dec08.orig/mm/memcontrol.c
+++ mmotm-2.6.28-Dec08/mm/memcontrol.c
@@ -1382,20 +1382,73 @@ int mem_cgroup_shrink_usage(struct mm_st
  * page_alloc.c::setup_per_zone_inactive_ratio().
  * it describe more detail.
  */
-static void mem_cgroup_set_inactive_ratio(struct mem_cgroup *memcg)
+static int __mem_cgroup_inactive_ratio(unsigned long long gb)
 {
-	unsigned int gb, ratio;
+	unsigned int ratio;
 
-	gb = res_counter_read_u64(&memcg->res, RES_LIMIT) >> 30;
+	gb = gb >> 30;
 	if (gb)
 		ratio = int_sqrt(10 * gb);
 	else
 		ratio = 1;
 
-	memcg->inactive_ratio = ratio;
+	return ratio;
+}
+
+
+static void mem_cgroup_update_inactive_ratio(struct mem_cgroup *memcg)
+{
+	struct cgroup *cur;
+	struct mem_cgroup *tmp;
+	unsigned long long min_limit, limit;
+	int depth, nextid, rootid, found, ratio;
+
+	if (!memcg->use_hierarchy) {
+		limit = res_counter_read_u64(&memcg->res, RES_LIMIT);
+		memcg->inactive_ratio = __mem_cgroup_inactive_ratio(limit);
+		return;
+	}
 
+	cur = memcg->css.cgroup;
+	min_limit = res_counter_read_u64(&memcg->res, RES_LIMIT);
+
+	/* go up to root cgroup and find min limit.*/
+	while (cur->parent != NULL) {
+		tmp = mem_cgroup_from_cont(cur);
+		if (!tmp->use_hierarchy)
+			break;
+		limit = res_counter_read_u64(&tmp->res, RES_LIMIT);
+		if (limit < min_limit)
+			min_limit = limit;
+		cur = cur->parent;
+	}
+	/* new inactive ratio for this hierarchy */
+	ratio = __mem_cgroup_inactive_ratio(min_limit);
+
+	/*
+	 * update inactive ratio under this.
+	 * all children's inactive_ratio will be updated.
+	 */
+	cur = memcg->css.cgroup;
+	rootid = cgroup_id(cur);
+	depth = cgroup_depth(cur);
+	nextid = 0;
+	rcu_read_lock();
+	while (1) {
+		cur = cgroup_get_next(nextid, rootid, depth, &found);
+		if (!cur)
+			break;
+		if (!cgroup_is_removed(cur)) {
+			tmp = mem_cgroup_from_cont(cur);
+			tmp->inactive_ratio = ratio;
+		}
+		nextid = found + 1;
+	}
+	rcu_read_unlock();
 }
 
+
+
 static DEFINE_MUTEX(set_limit_mutex);
 
 static int mem_cgroup_resize_limit(struct mem_cgroup *memcg,
@@ -1435,8 +1488,11 @@ static int mem_cgroup_resize_limit(struc
   		if (!progress)			retry_count--;
 	}
 
-	if (!ret)
-		mem_cgroup_set_inactive_ratio(memcg);
+	if (!ret) {
+		mutex_lock(&set_limit_mutex);
+		mem_cgroup_update_inactive_ratio(memcg);
+		mutex_unlock(&set_limit_mutex);
+	}
 
 	return ret;
 }
@@ -2081,11 +2137,12 @@ mem_cgroup_create(struct cgroup_subsys *
 	if (parent && parent->use_hierarchy) {
 		res_counter_init(&mem->res, &parent->res);
 		res_counter_init(&mem->memsw, &parent->memsw);
+		/* min_limit under hierarchy is unchanged.*/
+		mem->inactive_ratio = parent->inactive_ratio;
 	} else {
 		res_counter_init(&mem->res, NULL);
 		res_counter_init(&mem->memsw, NULL);
 	}
-	mem_cgroup_set_inactive_ratio(mem);
 	mem->last_scanned_child = 0;
 	mem->scan_age = 0;
 	spin_lock_init(&mem->reclaim_param_lock);


