From: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
To: balbir@linux.vnet.ibm.com
Cc: linux-mm <linux-mm@kvack.org>,
Andrew Morton <akpm@linux-foundation.org>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
Li Zefan <lizf@cn.fujitsu.com>, Paul Menage <menage@google.com>,
Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Subject: Re: [PATCH -mmotm 4/5] memcg: avoid oom during recharge at task move
Date: Tue, 24 Nov 2009 11:43:58 +0900
Message-ID: <20091124114358.80e0cafe.nishimura@mxp.nes.nec.co.jp>
In-Reply-To: <20091123051041.GQ31961@balbir.in.ibm.com>
On Mon, 23 Nov 2009 10:40:41 +0530, Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
> * nishimura@mxp.nes.nec.co.jp <nishimura@mxp.nes.nec.co.jp> [2009-11-19 13:30:30]:
>
> > This recharge-at-task-move feature has extra charges(pre-charges) on "to"
> > mem_cgroup during recharging. This means unnecessary oom can happen.
> >
> > This patch tries to avoid such oom.
> >
> > Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
> > ---
> > mm/memcontrol.c | 28 ++++++++++++++++++++++++++++
> > 1 files changed, 28 insertions(+), 0 deletions(-)
> >
> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > index df363da..3a07383 100644
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -249,6 +249,7 @@ struct recharge_struct {
> > struct mem_cgroup *from;
> > struct mem_cgroup *to;
> > unsigned long precharge;
> > + struct task_struct *working; /* a task moving the target task */
>
> working does not sound like an appropriate name
>
Then, how about "moving"?
> > };
> > static struct recharge_struct recharge;
> >
> > @@ -1494,6 +1495,30 @@ static int __mem_cgroup_try_charge(struct mm_struct *mm,
> > if (mem_cgroup_check_under_limit(mem_over_limit))
> > continue;
> >
> > + /* try to avoid oom while someone is recharging */
> > + if (recharge.working && current != recharge.working) {
> > + struct mem_cgroup *dest;
> > + bool do_continue = false;
> > + /*
> > + * There is a small race that "dest" can be freed by
> > + * rmdir, so we use css_tryget().
> > + */
> > + rcu_read_lock();
> > + dest = recharge.to;
> > + if (dest && css_tryget(&dest->css)) {
> > + if (dest->use_hierarchy)
> > + do_continue = css_is_ancestor(
> > + &dest->css,
> > + &mem_over_limit->css);
> > + else
> > + do_continue = (dest == mem_over_limit);
> > + css_put(&dest->css);
> > + }
> > + rcu_read_unlock();
> > + if (do_continue)
> > + continue;
>
> IIUC, if dest is the current cgroup we are trying to charge to or an
> ancestor of the current cgroup, we don't OOM?
>
We don't OOM:
- if dest is the cgroup we are trying to charge to (in the non-hierarchy case).
- if the cgroup we are trying to charge to is an ancestor of dest (in the hierarchy case).
This is because the feature holds some extra charges (pre-charges) on the dest cgroup
while the move is in progress, so we had better avoid OOM caused by dest's charges
during that time.
BTW, the above code has a bug. We should check mem_over_limit->use_hierarchy,
not dest->use_hierarchy. With the dest->use_hierarchy check, do_continue becomes
true even when:
	/A	: use_hierarchy == 0 <- mem_over_limit
	  00/	: use_hierarchy == 1 <- dest
(IIUC, css_is_ancestor() only checks the hierarchy at the cgroup layer.)
The current task_in_mem_cgroup() has the same bug, which can lead to killing an
innocent task (I'll check, test, and send a patch later).
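To illustrate the use_hierarchy problem, here is a minimal user-space model of the
check, not kernel code. The names memcg_model, model_is_ancestor(), skip_oom_posted()
and skip_oom_proposed() are made up for this sketch, and model_is_ancestor() only
assumes that css_is_ancestor() walks the cgroup tree and treats a cgroup as its own
ancestor.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for struct mem_cgroup: only what the check needs. */
struct memcg_model {
	struct memcg_model *parent;	/* parent in the cgroup tree, NULL at the top */
	bool use_hierarchy;
};

/*
 * Assumed model of css_is_ancestor(child, root): true if "root" is "child"
 * itself or any ancestor of it in the cgroup tree. It ignores use_hierarchy.
 */
static bool model_is_ancestor(const struct memcg_model *child,
			      const struct memcg_model *root)
{
	for (; child; child = child->parent)
		if (child == root)
			return true;
	return false;
}

/* The check as posted in the patch: keyed on dest->use_hierarchy. */
static bool skip_oom_posted(const struct memcg_model *dest,
			    const struct memcg_model *over_limit)
{
	assert(dest && over_limit);
	if (dest->use_hierarchy)
		return model_is_ancestor(dest, over_limit);
	return dest == over_limit;
}

/* The proposed fix: keyed on mem_over_limit->use_hierarchy instead. */
static bool skip_oom_proposed(const struct memcg_model *dest,
			      const struct memcg_model *over_limit)
{
	assert(dest && over_limit);
	if (over_limit->use_hierarchy)
		return model_is_ancestor(dest, over_limit);
	return dest == over_limit;
}
```

With /A (use_hierarchy == 0) as over_limit and its child (use_hierarchy == 1) as dest,
skip_oom_posted() returns true although dest's charges are not accounted to /A, while
skip_oom_proposed() returns false.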
> > + }
> > +
> > if (!nr_retries--) {
> > if (oom) {
> > mem_cgroup_out_of_memory(mem_over_limit, gfp_mask);
> > @@ -3474,6 +3499,7 @@ static void mem_cgroup_clear_recharge(void)
> > }
> > recharge.from = NULL;
> > recharge.to = NULL;
> > + recharge.working = NULL;
> > }
> >
> > static int mem_cgroup_can_attach(struct cgroup_subsys *ss,
> > @@ -3498,9 +3524,11 @@ static int mem_cgroup_can_attach(struct cgroup_subsys *ss,
> > VM_BUG_ON(recharge.from);
> > VM_BUG_ON(recharge.to);
> > VM_BUG_ON(recharge.precharge);
> > + VM_BUG_ON(recharge.working);
> > recharge.from = from;
> > recharge.to = mem;
> > recharge.precharge = 0;
> > + recharge.working = current;
> >
> > ret = mem_cgroup_prepare_recharge(mm);
> > if (ret)
>
> Sorry, if I missed it, but I did not see any time overhead of moving a
> task after these changes. Could you please help me understand the cost
> of moving say a task with 1G anonymous memory to another group and
> the cost of moving a task with 512MB anonymous and 512 page cache
> mapped, etc. It would be nice to understand the overall cost.
>
O.K.
I'll test with programs that use a large amount of anonymous memory, measure the
time it takes to move them, and report the results.
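A rough sketch of how such a measurement could be done from user space is below.
It is only an assumption about the test setup, not the actual test program:
touch_anon() and timed_move() are hypothetical helpers, the "tasks" file path depends
on where the memory cgroup is mounted, and the timing is meaningful only if the
recharge is done synchronously while the pid is being written.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/*
 * Map len bytes of anonymous memory and write to every page so that each
 * page is faulted in (and therefore charged). Returns the number of pages
 * touched, or 0 if mmap() fails.
 */
static size_t touch_anon(size_t len, void **out)
{
	size_t psz = (size_t)sysconf(_SC_PAGESIZE);
	size_t off, n = 0;
	char *p;

	assert(out != NULL);
	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return 0;
	for (off = 0; off < len; off += psz) {
		p[off] = 1;
		n++;
	}
	*out = p;
	return n;
}

/*
 * Move a task by writing its pid to a cgroup "tasks" file and return the
 * elapsed wall-clock seconds, or a negative value on error.
 * tasks_path is an assumption; it depends on the cgroup mount point.
 */
static double timed_move(const char *tasks_path, pid_t pid)
{
	struct timespec t0, t1;
	FILE *f;
	int err;

	clock_gettime(CLOCK_MONOTONIC, &t0);
	f = fopen(tasks_path, "w");
	if (!f)
		return -1.0;
	err = fprintf(f, "%d\n", (int)pid) < 0;
	/* The buffered write reaches the kernel at fclose(), so time that too. */
	err |= fclose(f) != 0;
	clock_gettime(CLOCK_MONOTONIC, &t1);
	if (err)
		return -1.0;
	return (double)(t1.tv_sec - t0.tv_sec) +
	       (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```

A test would call touch_anon() with, say, 1GB in the target task, then call
timed_move() with that task's pid and the destination cgroup's tasks file, once with
recharge enabled and once without, and compare the two times.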
Regards,
Daisuke Nishimura.