From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
To: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Balbir Singh <balbir@linux.vnet.ibm.com>,
Li Zefan <lizf@cn.fujitsu.com>, Paul Menage <menage@google.com>,
linux-mm <linux-mm@kvack.org>
Subject: Re: [PATCH -mmotm 4/7] memcg: improbe performance in moving charge
Date: Fri, 4 Dec 2009 16:10:04 +0900 [thread overview]
Message-ID: <20091204161004.146ae715.kamezawa.hiroyu@jp.fujitsu.com> (raw)
In-Reply-To: <20091204145049.261b001b.nishimura@mxp.nes.nec.co.jp>
On Fri, 4 Dec 2009 14:50:49 +0900
Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> wrote:
> This patch tries to reduce overheads in moving charge by:
>
> - Instead of calling res_counter_uncharge against the old cgroup in
> __mem_cgroup_move_account everytime, call res_counter_uncharge at the end of
> task migration once.
> - Instead of calling res_counter_charge(via __mem_cgroup_try_charge) repeatedly,
> call res_counter_charge(PAGE_SIZE * count) in can_attach() if possible.
> - Adds a new arg(count) to __css_put and make it decrement the css->refcnt
> by "count", not 1.
> - Add a new function(__css_get), which takes "count" as a arg and increment
> the css->recnt by "count".
> - Instead of calling css_get/css_put repeatedly, call new __css_get/__css_put
> if possible.
> - removed css_get(&to->css) from __mem_cgroup_move_account(callers should have
> already called css_get), and removed css_put(&to->css) too, which is called by
> callers of move_account on success of move_account.
>
> These changes reduces the overhead from 1.7sec to 0.6sec to move charges of 1G
> anonymous memory in my test environment.
>
> Changelog: 2009/12/04
> - new patch
>
seems nice in general.
> Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
> ---
> include/linux/cgroup.h | 12 +++-
> kernel/cgroup.c | 5 +-
> mm/memcontrol.c | 151 +++++++++++++++++++++++++++++++-----------------
> 3 files changed, 109 insertions(+), 59 deletions(-)
>
> diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
> index d4cc200..61f75ae 100644
> --- a/include/linux/cgroup.h
> +++ b/include/linux/cgroup.h
> @@ -75,6 +75,12 @@ enum {
> CSS_REMOVED, /* This CSS is dead */
> };
>
> +/* Caller must verify that the css is not for root cgroup */
> +static inline void __css_get(struct cgroup_subsys_state *css, int count)
> +{
> + atomic_add(count, &css->refcnt);
> +}
> +
> /*
> * Call css_get() to hold a reference on the css; it can be used
> * for a reference obtained via:
> @@ -86,7 +92,7 @@ static inline void css_get(struct cgroup_subsys_state *css)
> {
> /* We don't need to reference count the root state */
> if (!test_bit(CSS_ROOT, &css->flags))
> - atomic_inc(&css->refcnt);
> + __css_get(css, 1);
> }
>
> static inline bool css_is_removed(struct cgroup_subsys_state *css)
> @@ -117,11 +123,11 @@ static inline bool css_tryget(struct cgroup_subsys_state *css)
> * css_get() or css_tryget()
> */
>
> -extern void __css_put(struct cgroup_subsys_state *css);
> +extern void __css_put(struct cgroup_subsys_state *css, int count);
> static inline void css_put(struct cgroup_subsys_state *css)
> {
> if (!test_bit(CSS_ROOT, &css->flags))
> - __css_put(css);
> + __css_put(css, 1);
> }
>
Maybe it's better to divide cgroup part in other patches. Li or Paul has to review.
> /* bits in struct cgroup flags field */
> diff --git a/kernel/cgroup.c b/kernel/cgroup.c
> index d67d471..44f5924 100644
> --- a/kernel/cgroup.c
> +++ b/kernel/cgroup.c
> @@ -3729,12 +3729,13 @@ static void check_for_release(struct cgroup *cgrp)
> }
> }
>
> -void __css_put(struct cgroup_subsys_state *css)
> +/* Caller must verify that the css is not for root cgroup */
> +void __css_put(struct cgroup_subsys_state *css, int count)
> {
> struct cgroup *cgrp = css->cgroup;
> int val;
> rcu_read_lock();
> - val = atomic_dec_return(&css->refcnt);
> + val = atomic_sub_return(count, &css->refcnt);
> if (val == 1) {
> if (notify_on_release(cgrp)) {
> set_bit(CGRP_RELEASABLE, &cgrp->flags);
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index e38f211..769b85a 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -252,6 +252,7 @@ struct move_charge_struct {
> struct mem_cgroup *from;
> struct mem_cgroup *to;
> unsigned long precharge;
> + unsigned long moved_charge;
> };
> static struct move_charge_struct mc;
>
> @@ -1532,14 +1533,23 @@ nomem:
> * This function is for that and do uncharge, put css's refcnt.
> * gotten by try_charge().
> */
> -static void mem_cgroup_cancel_charge(struct mem_cgroup *mem)
> +static void __mem_cgroup_cancel_charge(struct mem_cgroup *mem,
> + unsigned long count)
> {
> if (!mem_cgroup_is_root(mem)) {
> - res_counter_uncharge(&mem->res, PAGE_SIZE);
> + res_counter_uncharge(&mem->res, PAGE_SIZE * count);
> if (do_swap_account)
> - res_counter_uncharge(&mem->memsw, PAGE_SIZE);
> + res_counter_uncharge(&mem->memsw, PAGE_SIZE * count);
> + VM_BUG_ON(test_bit(CSS_ROOT, &mem->css.flags));
> + WARN_ON_ONCE(count > INT_MAX);
Hmm. is this WARN_ON necessary ? ...maybe res_counter_uncharge() will catch
this, anyway.
> + __css_put(&mem->css, (int)count);
> }
> - css_put(&mem->css);
> + /* we don't need css_put for root */
> +}
> +
> +static void mem_cgroup_cancel_charge(struct mem_cgroup *mem)
> +{
> + __mem_cgroup_cancel_charge(mem, 1);
> }
>
> /*
> @@ -1645,17 +1655,20 @@ static void __mem_cgroup_commit_charge(struct mem_cgroup *mem,
> * @pc: page_cgroup of the page.
> * @from: mem_cgroup which the page is moved from.
> * @to: mem_cgroup which the page is moved to. @from != @to.
> + * @uncharge: whether we should call uncharge and css_put against @from.
> *
> * The caller must confirm following.
> * - page is not on LRU (isolate_page() is useful.)
> * - the pc is locked, used, and ->mem_cgroup points to @from.
> *
> - * This function does "uncharge" from old cgroup but doesn't do "charge" to
> - * new cgroup. It should be done by a caller.
> + * This function doesn't do "charge" nor css_get to new cgroup. It should be
> + * done by a caller(__mem_cgroup_try_charge would be usefull). If @uncharge is
> + * true, this function does "uncharge" from old cgroup, but it doesn't if
> + * @uncharge is false, so a caller should do "uncharge".
> */
>
> static void __mem_cgroup_move_account(struct page_cgroup *pc,
> - struct mem_cgroup *from, struct mem_cgroup *to)
> + struct mem_cgroup *from, struct mem_cgroup *to, bool uncharge)
> {
> struct page *page;
> int cpu;
> @@ -1668,10 +1681,6 @@ static void __mem_cgroup_move_account(struct page_cgroup *pc,
> VM_BUG_ON(!PageCgroupUsed(pc));
> VM_BUG_ON(pc->mem_cgroup != from);
>
> - if (!mem_cgroup_is_root(from))
> - res_counter_uncharge(&from->res, PAGE_SIZE);
> - mem_cgroup_charge_statistics(from, pc, false);
> -
> page = pc->page;
> if (page_mapped(page) && !PageAnon(page)) {
> cpu = smp_processor_id();
> @@ -1687,12 +1696,12 @@ static void __mem_cgroup_move_account(struct page_cgroup *pc,
> __mem_cgroup_stat_add_safe(cpustat, MEM_CGROUP_STAT_FILE_MAPPED,
> 1);
> }
> + mem_cgroup_charge_statistics(from, pc, false);
> + if (uncharge)
> + /* This is not "cancel", but cancel_charge does all we need. */
> + mem_cgroup_cancel_charge(from);
>
> - if (do_swap_account && !mem_cgroup_is_root(from))
> - res_counter_uncharge(&from->memsw, PAGE_SIZE);
> - css_put(&from->css);
> -
> - css_get(&to->css);
> + /* caller should have done css_get */
> pc->mem_cgroup = to;
> mem_cgroup_charge_statistics(to, pc, true);
> /*
> @@ -1709,12 +1718,12 @@ static void __mem_cgroup_move_account(struct page_cgroup *pc,
> * __mem_cgroup_move_account()
> */
> static int mem_cgroup_move_account(struct page_cgroup *pc,
> - struct mem_cgroup *from, struct mem_cgroup *to)
> + struct mem_cgroup *from, struct mem_cgroup *to, bool uncharge)
> {
> int ret = -EINVAL;
> lock_page_cgroup(pc);
> if (PageCgroupUsed(pc) && pc->mem_cgroup == from) {
> - __mem_cgroup_move_account(pc, from, to);
> + __mem_cgroup_move_account(pc, from, to, uncharge);
> ret = 0;
> }
> unlock_page_cgroup(pc);
> @@ -1750,11 +1759,9 @@ static int mem_cgroup_move_parent(struct page_cgroup *pc,
> if (ret || !parent)
> goto put_back;
>
> - ret = mem_cgroup_move_account(pc, child, parent);
> - if (!ret)
> - css_put(&parent->css); /* drop extra refcnt by try_charge() */
> - else
> - mem_cgroup_cancel_charge(parent); /* does css_put */
> + ret = mem_cgroup_move_account(pc, child, parent, true);
> + if (ret)
> + mem_cgroup_cancel_charge(parent);
> put_back:
> putback_lru_page(page);
> put:
> @@ -3441,16 +3448,57 @@ static int mem_cgroup_populate(struct cgroup_subsys *ss,
> }
>
> /* Handlers for move charge at task migration. */
> -static int mem_cgroup_do_precharge(void)
> +#define PRECHARGE_COUNT_AT_ONCE 256
> +static int mem_cgroup_do_precharge(unsigned long count)
> {
> - int ret = -ENOMEM;
> + int ret = 0;
> + int batch_count = PRECHARGE_COUNT_AT_ONCE;
> struct mem_cgroup *mem = mc.to;
>
> - ret = __mem_cgroup_try_charge(NULL, GFP_KERNEL, &mem, false, NULL);
> - if (ret || !mem)
> - return -ENOMEM;
> -
> - mc.precharge++;
> + if (mem_cgroup_is_root(mem)) {
> + mc.precharge += count;
> + /* we don't need css_get for root */
> + return ret;
> + }
> + /* try to charge at once */
> + if (count > 1) {
> + struct res_counter *dummy;
> + /*
> + * "mem" cannot be under rmdir() because we've already checked
> + * by cgroup_lock_live_cgroup() that it is not removed and we
> + * are still under the same cgroup_mutex. So we can postpone
> + * css_get().
> + */
> + if (res_counter_charge(&mem->res, PAGE_SIZE * count, &dummy))
> + goto one_by_one;
if (do_swap_account) here.
> + if (res_counter_charge(&mem->memsw,
> + PAGE_SIZE * count, &dummy)) {
> + res_counter_uncharge(&mem->res, PAGE_SIZE * count);
> + goto one_by_one;
> + }
> + mc.precharge += count;
> + VM_BUG_ON(test_bit(CSS_ROOT, &mem->css.flags));
> + WARN_ON_ONCE(count > INT_MAX);
> + __css_get(&mem->css, (int)count);
> + return ret;
> + }
> +one_by_one:
> + /* fall back to one by one charge */
> + while (!ret && count--) {
!ret check seems unnecessary.
> + if (signal_pending(current)) {
> + ret = -EINTR;
> + break;
> + }
> + if (!batch_count--) {
> + batch_count = PRECHARGE_COUNT_AT_ONCE;
> + cond_resched();
> + }
> + ret = __mem_cgroup_try_charge(NULL, GFP_KERNEL, &mem,
> + false, NULL);
> + if (ret || !mem)
> + return -ENOMEM;
returning without uncharge here ?
Thanks,
-Kame
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2009-12-04 7:13 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-12-04 5:46 [PATCH -mmotm 0/7] memcg: move charge at task migration (04/Dec) Daisuke Nishimura
2009-12-04 5:47 ` [PATCH -mmotm 1/7] cgroup: introduce cancel_attach() Daisuke Nishimura
2009-12-04 5:48 ` [PATCH -mmotm 2/7] memcg: add interface to move charge at task migration Daisuke Nishimura
2009-12-04 6:55 ` KAMEZAWA Hiroyuki
2009-12-04 5:49 ` [PATCH -mmotm 3/7] memcg: move charges of anonymous page Daisuke Nishimura
2009-12-04 5:50 ` [PATCH -mmotm 4/7] memcg: improbe performance in moving charge Daisuke Nishimura
2009-12-04 7:10 ` KAMEZAWA Hiroyuki [this message]
2009-12-04 7:29 ` Daisuke Nishimura
2009-12-04 5:51 ` [PATCH -mmotm 5/7] memcg: avoid oom during " Daisuke Nishimura
2009-12-04 7:14 ` KAMEZAWA Hiroyuki
2009-12-04 7:43 ` Daisuke Nishimura
2009-12-04 5:52 ` [PATCH -mmotm 6/7] memcg: move charges of anonymous swap Daisuke Nishimura
2009-12-04 7:32 ` KAMEZAWA Hiroyuki
2009-12-04 10:53 ` Daisuke Nishimura
2009-12-04 5:54 ` [PATCH -mmotm 7/7] memcg: improbe performance in moving swap charge Daisuke Nishimura
2009-12-04 6:53 ` [PATCH -mmotm 0/7] memcg: move charge at task migration (04/Dec) KAMEZAWA Hiroyuki
2009-12-04 7:00 ` KAMEZAWA Hiroyuki
2009-12-07 6:34 ` Daisuke Nishimura
2009-12-09 0:21 ` KAMEZAWA Hiroyuki
2009-12-04 7:09 ` Daisuke Nishimura
2009-12-04 7:34 ` KAMEZAWA Hiroyuki
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20091204161004.146ae715.kamezawa.hiroyu@jp.fujitsu.com \
--to=kamezawa.hiroyu@jp.fujitsu.com \
--cc=akpm@linux-foundation.org \
--cc=balbir@linux.vnet.ibm.com \
--cc=linux-mm@kvack.org \
--cc=lizf@cn.fujitsu.com \
--cc=menage@google.com \
--cc=nishimura@mxp.nes.nec.co.jp \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox