From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
To: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: "linux-mm@kvack.org" <linux-mm@kvack.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
"balbir@linux.vnet.ibm.com" <balbir@linux.vnet.ibm.com>,
"nishimura@mxp.nes.nec.co.jp" <nishimura@mxp.nes.nec.co.jp>
Subject: Re: [PATCH 0/2] memcg: improving scalability by reducing lock contention at charge/uncharge
Date: Mon, 5 Oct 2009 16:18:08 +0900
Message-ID: <20091005161808.fa5ab0c6.kamezawa.hiroyu@jp.fujitsu.com>
In-Reply-To: <20091002175310.0991139c.kamezawa.hiroyu@jp.fujitsu.com>
On Fri, 2 Oct 2009 17:53:10 +0900
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:
> > [After]
> > Performance counter stats for './runpause.sh' (5 runs):
> >
> > 474658.997489 task-clock-msecs # 7.891 CPUs ( +- 0.006% )
> > 10250 context-switches # 0.000 M/sec ( +- 0.020% )
> > 11 CPU-migrations # 0.000 M/sec ( +- 0.000% )
> > 33177858 page-faults # 0.070 M/sec ( +- 0.152% )
> > 1485264748476 cycles # 3129.120 M/sec ( +- 0.021% )
> > 409847004519 instructions # 0.276 IPC ( +- 0.123% )
> > 3237478723 cache-references # 6.821 M/sec ( +- 0.574% )
> > 1182572827 cache-misses # 2.491 M/sec ( +- 0.179% )
> >
> > 60.151786309 seconds time elapsed ( +- 0.014% )
> >
> BTW, this is a score in root cgroup.
>
>
> 473811.590852 task-clock-msecs # 7.878 CPUs ( +- 0.006% )
> 10257 context-switches # 0.000 M/sec ( +- 0.049% )
> 10 CPU-migrations # 0.000 M/sec ( +- 0.000% )
> 36418112 page-faults # 0.077 M/sec ( +- 0.195% )
> 1482880352588 cycles # 3129.684 M/sec ( +- 0.011% )
> 410948762898 instructions # 0.277 IPC ( +- 0.123% )
> 3182986911 cache-references # 6.718 M/sec ( +- 0.555% )
> 1147144023 cache-misses # 2.421 M/sec ( +- 0.137% )
>
>
> Then,
> 36418112 x 100 / 33177858 = 109.8%, i.e. root handles about 10% more page faults than the children cgroup.
>
This is an additional patch now under testing (just experimental).
Result of the above test with this patch applied:
==
[root cgroup]
37062405 page-faults # 0.078 M/sec ( +- 0.156% )
[children]
35876894 page-faults # 0.076 M/sec ( +- 0.233% )
==
Close to my target....
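To make the comparison concrete, here is a small sketch of the arithmetic behind the root/children gap. The helper name gap_percent() and the constant names are made up for illustration; only the page-fault counts come from the perf output in this thread.

```c
/* Percentage by which the root-cgroup count exceeds the child-cgroup count. */
static double gap_percent(double root_faults, double child_faults)
{
    return (root_faults - child_faults) * 100.0 / child_faults;
}

/* Page-fault counts copied from the perf output in this thread. */
static const double before_root = 36418112.0, before_child = 33177858.0;
static const double after_root  = 37062405.0, after_child  = 35876894.0;
```

gap_percent(before_root, before_child) comes out at roughly 9.8, and gap_percent(after_root, after_child) at roughly 3.3, so the gap between root and children shrinks to about a third with the experimental patch.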
This patch adds bulk_css_put() and coalesces css_put() calls in the
batched-uncharge path, avoiding frequent calls to css_put().
The coalescing-uncharge patch reduces references to res_counter,
but css_put() is still called once per page.
Of course, we can coalesce multiple css_put() calls into a single call
of bulk_css_put().
Doing so reduces false sharing on the css reference count and should
have a good effect on scalability.
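The idea can be sketched in user space with C11 atomics. This is only an illustration, not the kernel code: struct ref_obj, ref_put_many(), ref_put() and bulk_ref_put() are hypothetical names, and the is_root flag stands in for the CSS_ROOT test.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for struct cgroup_subsys_state. */
struct ref_obj {
    atomic_int refcnt;
    bool is_root;   /* mirrors the CSS_ROOT test: root is never counted */
};

/* Drop n references with one atomic operation; returns the new count. */
static int ref_put_many(struct ref_obj *obj, int n)
{
    /* atomic_fetch_sub() returns the old value, so subtract n again. */
    return atomic_fetch_sub(&obj->refcnt, n) - n;
}

/* Per-page variant: one atomic operation per reference dropped. */
static void ref_put(struct ref_obj *obj)
{
    if (!obj->is_root)
        ref_put_many(obj, 1);
}

/*
 * Batched variant: one atomic operation for the whole batch.
 * The n > 0 guard is an addition of this sketch, not part of the patch.
 */
static void bulk_ref_put(struct ref_obj *obj, int n)
{
    if (!obj->is_root && n > 0)
        ref_put_many(obj, n);
}
```

Calling ref_put() four times and calling bulk_ref_put(obj, 4) once leave the same count, but the batched form touches the shared cache line once instead of four times, which is where the reduction in false sharing comes from.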
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
include/linux/cgroup.h | 10 ++++++++--
kernel/cgroup.c | 5 ++---
mm/memcontrol.c | 16 +++++++++++-----
3 files changed, 21 insertions(+), 10 deletions(-)
Index: mmotm-2.6.31-Sep28/include/linux/cgroup.h
===================================================================
--- mmotm-2.6.31-Sep28.orig/include/linux/cgroup.h
+++ mmotm-2.6.31-Sep28/include/linux/cgroup.h
@@ -117,11 +117,17 @@ static inline bool css_tryget(struct cgr
* css_get() or css_tryget()
*/
-extern void __css_put(struct cgroup_subsys_state *css);
+extern void __css_put(struct cgroup_subsys_state *css, int val);
static inline void css_put(struct cgroup_subsys_state *css)
{
if (!test_bit(CSS_ROOT, &css->flags))
- __css_put(css);
+ __css_put(css, 1);
+}
+
+static inline void bulk_css_put(struct cgroup_subsys_state *css, int val)
+{
+ if (!test_bit(CSS_ROOT, &css->flags))
+ __css_put(css, val);
}
/* bits in struct cgroup flags field */
Index: mmotm-2.6.31-Sep28/kernel/cgroup.c
===================================================================
--- mmotm-2.6.31-Sep28.orig/kernel/cgroup.c
+++ mmotm-2.6.31-Sep28/kernel/cgroup.c
@@ -3705,12 +3705,11 @@ static void check_for_release(struct cgr
}
}
-void __css_put(struct cgroup_subsys_state *css)
+void __css_put(struct cgroup_subsys_state *css, int val)
{
struct cgroup *cgrp = css->cgroup;
- int val;
rcu_read_lock();
- val = atomic_dec_return(&css->refcnt);
+ val = atomic_sub_return(val, &css->refcnt);
if (val == 1) {
if (notify_on_release(cgrp)) {
set_bit(CGRP_RELEASABLE, &cgrp->flags);
Index: mmotm-2.6.31-Sep28/mm/memcontrol.c
===================================================================
--- mmotm-2.6.31-Sep28.orig/mm/memcontrol.c
+++ mmotm-2.6.31-Sep28/mm/memcontrol.c
@@ -1977,8 +1977,14 @@ __do_uncharge(struct mem_cgroup *mem, co
return;
direct_uncharge:
res_counter_uncharge(&mem->res, PAGE_SIZE);
- if (uncharge_memsw)
+ if (uncharge_memsw) {
res_counter_uncharge(&mem->memsw, PAGE_SIZE);
+ /*
+ * swapout-uncharge does css_put() by itself, so we do
+ * css_put() only in this case.
+ */
+ css_put(&mem->css);
+ }
return;
}
@@ -2048,9 +2054,6 @@ __mem_cgroup_uncharge_common(struct page
if (mem_cgroup_soft_limit_check(mem))
mem_cgroup_update_tree(mem, page);
- /* at swapout, this memcg will be accessed to record to swap */
- if (ctype != MEM_CGROUP_CHARGE_TYPE_SWAPOUT)
- css_put(&mem->css);
return mem;
@@ -2108,8 +2111,11 @@ void mem_cgroup_uncharge_end(void)
if (!mem)
return;
/* This "mem" is valid bacause we hide charges behind us. */
- if (current->memcg_batch.pages)
+ if (current->memcg_batch.pages) {
res_counter_uncharge(&mem->res, current->memcg_batch.pages);
+ bulk_css_put(&mem->css,
+ current->memcg_batch.pages >> PAGE_SHIFT);
+ }
if (current->memcg_batch.memsw)
res_counter_uncharge(&mem->memsw, current->memcg_batch.memsw);
/* Not necessary. but forget this pointer */
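A note on the ">> PAGE_SHIFT" in the last hunk: memcg_batch.pages appears to hold a byte count, since it is passed straight to res_counter_uncharge(), while one css reference is held per page. The following user-space sketch shows that accumulate-then-flush pattern. All names here (struct uncharge_batch, struct group, the SKETCH_* macros) are hypothetical, and a 4 KiB page size is assumed.

```c
#define SKETCH_PAGE_SHIFT 12                        /* assumes 4 KiB pages */
#define SKETCH_PAGE_SIZE  (1UL << SKETCH_PAGE_SHIFT)

/* Hypothetical stand-in for the per-task batch state. */
struct uncharge_batch {
    unsigned long bytes;    /* accumulated in bytes, like memcg_batch.pages */
};

/* Hypothetical stand-in for a memcg: usage in bytes plus a reference count. */
struct group {
    unsigned long usage;
    long refs;
};

/* Record one page in the batch instead of touching the shared counters. */
static void batch_uncharge_page(struct uncharge_batch *b)
{
    b->bytes += SKETCH_PAGE_SIZE;
}

/* Flush: one usage update and one reference drop for the whole batch. */
static void batch_uncharge_end(struct group *g, struct uncharge_batch *b)
{
    if (b->bytes) {
        g->usage -= b->bytes;                              /* counter is in bytes */
        g->refs  -= (long)(b->bytes >> SKETCH_PAGE_SHIFT); /* refs are per page */
        b->bytes = 0;
    }
}
```

After three batch_uncharge_page() calls, batch_uncharge_end() lowers usage by three pages' worth of bytes and drops exactly three references in a single step.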