From: Balbir Singh <balbir@linux.vnet.ibm.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Hugh Dickins <hugh@veritas.com>,
YAMAMOTO Takashi <yamamoto@valinux.co.jp>,
Paul Menage <menage@google.com>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
Balbir Singh <balbir@linux.vnet.ibm.com>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Subject: [2/5] memrlimit handle attach_task() failure, add can_attach() callback
Date: Thu, 26 Jun 2008 14:58:43 +0530 [thread overview]
Message-ID: <20080626092843.16841.59163.sendpatchset@balbir-laptop> (raw)
In-Reply-To: <20080626092815.16841.54817.sendpatchset@balbir-laptop>
Changelog v2->v1
1. Rename res_counter_add_check() to res_counter_can_add()
Making the first argument (struct res_counter *) a constant pointer causes
the compiler to spew out warnings in spin_(un)lock_irq* routines, since
we now pass address from a constant pointer to the lock routines.
This patch fixes a task migration problem reported by Kamezawa-San. This
patch should fix all issues with migraiton, except for a rare condition
documented in memrlimit_cgroup_move_task(). To fix that problem, we
would need to add transaction properties to cgroups.
The problem reported was that migrating to a group that did not have
sufficient limits to accept an incoming task caused a kernel warning.
Steps to reproduce
% mkdir /dev/cgroup/memrlimit/group_01
% mkdir /dev/cgroup/memrlimit/group_02
% echo 1G > /dev/cgroup/memrlimit/group_01/memrlimit.limit_in_bytes
% echo 0 > /dev/cgroup/memrlimit/group_02/memrlimit.limit_in_bytes
% echo $$ > /dev/cgroup/memrlimit/group_01/tasks
% echo $$ > /dev/cgroup/memrlimit/group_02/tasks
% exit
memrlimit does the right thing by not moving the charges to group_02,
but the task is still put into g2 (since we did not use can_attach to
fail migration). Once in g2, when we echo the task to the root cgroup,
it tries to uncharge the cost of the task from g2. g2 does not have
any charge associated with the task, hence we get a warning.
Reported-by: kamezawa.hiroyu@jp.fujitsu.com
Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com>
---
include/linux/res_counter.h | 18 ++++++++++++++++++
mm/memrlimitcgroup.c | 43 +++++++++++++++++++++++++++++++++++++++++++
2 files changed, 61 insertions(+)
diff -puN mm/memrlimitcgroup.c~memrlimit-cgroup-fix-attach-task mm/memrlimitcgroup.c
--- linux-2.6.26-rc5/mm/memrlimitcgroup.c~memrlimit-cgroup-fix-attach-task 2008-06-26 14:42:21.000000000 +0530
+++ linux-2.6.26-rc5-balbir/mm/memrlimitcgroup.c 2008-06-26 14:42:21.000000000 +0530
@@ -166,6 +166,38 @@ static int memrlimit_cgroup_populate(str
ARRAY_SIZE(memrlimit_cgroup_files));
}
+static int memrlimit_cgroup_can_move_task(struct cgroup_subsys *ss,
+ struct cgroup *cgrp,
+ struct task_struct *p)
+{
+ struct mm_struct *mm;
+ struct memrlimit_cgroup *memrcg;
+ int ret = 0;
+
+ mm = get_task_mm(p);
+ if (mm == NULL)
+ return -EINVAL;
+
+ /*
+ * Hold mmap_sem, so that total_vm does not change underneath us
+ */
+ down_read(&mm->mmap_sem);
+
+ rcu_read_lock();
+ if (p != rcu_dereference(mm->owner))
+ goto out;
+
+ memrcg = memrlimit_cgroup_from_cgrp(cgrp);
+
+ if (!res_counter_can_add(&memrcg->as_res, (mm->total_vm << PAGE_SHIFT)))
+ ret = -ENOMEM;
+out:
+ rcu_read_unlock();
+ up_read(&mm->mmap_sem);
+ mmput(mm);
+ return ret;
+}
+
static void memrlimit_cgroup_move_task(struct cgroup_subsys *ss,
struct cgroup *cgrp,
struct cgroup *old_cgrp,
@@ -193,6 +225,16 @@ static void memrlimit_cgroup_move_task(s
if (memrcg == old_memrcg)
goto out;
+ /*
+ * TBD: Even though we do the necessary checks in can_attach(),
+ * by the time we come here, there is a chance that we still
+ * fail (the memrlimit cgroup has grown its usage, and the
+ * addition of total_vm will no longer fit into its limit)
+ *
+ * We need transactional support in cgroups to let us know
+ * if can_attach() has failed and call attach_failed() on
+ * cgroups for which can_attach() succeeded.
+ */
if (res_counter_charge(&memrcg->as_res, (mm->total_vm << PAGE_SHIFT)))
goto out;
res_counter_uncharge(&old_memrcg->as_res, (mm->total_vm << PAGE_SHIFT));
@@ -231,6 +273,7 @@ struct cgroup_subsys memrlimit_cgroup_su
.destroy = memrlimit_cgroup_destroy,
.populate = memrlimit_cgroup_populate,
.attach = memrlimit_cgroup_move_task,
+ .can_attach = memrlimit_cgroup_can_move_task,
.mm_owner_changed = memrlimit_cgroup_mm_owner_changed,
.early_init = 0,
};
diff -puN kernel/res_counter.c~memrlimit-cgroup-fix-attach-task kernel/res_counter.c
diff -puN include/linux/res_counter.h~memrlimit-cgroup-fix-attach-task include/linux/res_counter.h
--- linux-2.6.26-rc5/include/linux/res_counter.h~memrlimit-cgroup-fix-attach-task 2008-06-26 14:42:21.000000000 +0530
+++ linux-2.6.26-rc5-balbir/include/linux/res_counter.h 2008-06-26 14:44:39.000000000 +0530
@@ -153,4 +153,22 @@ static inline void res_counter_reset_fai
cnt->failcnt = 0;
spin_unlock_irqrestore(&cnt->lock, flags);
}
+
+/*
+ * Add the value val to the resource counter and check if we are
+ * still under the limit.
+ */
+static inline bool res_counter_can_add(struct res_counter *cnt,
+ unsigned long val)
+{
+ bool ret = false;
+ unsigned long flags;
+
+ spin_lock_irqsave(&cnt->lock, flags);
+ if (cnt->usage + val <= cnt->limit)
+ ret = true;
+ spin_unlock_irqrestore(&cnt->lock, flags);
+ return ret;
+}
+
#endif
_
--
Warm Regards,
Balbir Singh
Linux Technology Center
IBM, ISTL
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2008-06-26 9:28 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-06-26 9:28 [0/5] memrlimit fixes Balbir Singh
2008-06-26 9:28 ` [1/5] memrlimit improve error handling Balbir Singh
2008-06-26 9:28 ` Balbir Singh [this message]
2008-06-26 9:28 ` [3/5] memrlimit fix sleep inside sleeplock in mm_update_next_owner() Balbir Singh
2008-06-26 9:29 ` [4/5] memrlimit improve fork and error handling Balbir Singh
2008-06-26 9:29 ` [5/5] memrlimit correct mremap and move_vma accounting Balbir Singh
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20080626092843.16841.59163.sendpatchset@balbir-laptop \
--to=balbir@linux.vnet.ibm.com \
--cc=akpm@linux-foundation.org \
--cc=hugh@veritas.com \
--cc=kamezawa.hiroyu@jp.fujitsu.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=menage@google.com \
--cc=yamamoto@valinux.co.jp \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox