From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
To: David Rientjes <rientjes@google.com>
Cc: Ying Han <yinghan@google.com>,
KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
Minchan Kim <minchan.kim@gmail.com>,
Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>,
Balbir Singh <balbir@linux.vnet.ibm.com>,
Tejun Heo <tj@kernel.org>,
Mark Brown <broonie@opensource.wolfsonmicro.com>,
Andrew Morton <akpm@linux-foundation.org>,
linux-mm@kvack.org
Subject: Re: [PATCH V3] Add the pagefault count into memcg stats
Date: Thu, 14 Apr 2011 08:52:39 +0900 [thread overview]
Message-ID: <20110414085239.a597fb5c.kamezawa.hiroyu@jp.fujitsu.com> (raw)
In-Reply-To: <alpine.DEB.2.00.1104131301180.8140@chino.kir.corp.google.com>
On Wed, 13 Apr 2011 13:12:33 -0700 (PDT)
David Rientjes <rientjes@google.com> wrote:
> On Tue, 29 Mar 2011, Ying Han wrote:
>
> > Two new stats in per-memcg memory.stat track the number of page
> > faults and the number of major page faults:
> >
> > "pgfault"
> > "pgmajfault"
> >
> > They differ from the "pgpgin"/"pgpgout" stats, which count the number of
> > pages charged/uncharged to the cgroup and say nothing about reading or
> > writing pages to disk.
> >
> > It is valuable to track these two stats both for measuring application
> > performance and for gauging the efficiency of the kernel page reclaim
> > path. Counting page faults per process is useful, but we also need the
> > aggregated value, since memcg monitors and controls processes on a
> > per-cgroup basis.
> >
> > Functional test: check the total number of pgfault/pgmajfault of all
> > memcgs and compare with global vmstat value:
> >
> > $ cat /proc/vmstat | grep fault
> > pgfault 1070751
> > pgmajfault 553
> >
> > $ cat /dev/cgroup/memory.stat | grep fault
> > pgfault 1071138
> > pgmajfault 553
> > total_pgfault 1071142
> > total_pgmajfault 553
> >
> > $ cat /dev/cgroup/A/memory.stat | grep fault
> > pgfault 199
> > pgmajfault 0
> > total_pgfault 199
> > total_pgmajfault 0
> >
> > Performance test: run the page fault test (pft) with 16 threads, faulting
> > in 15G of anon pages in a 16G container. No regression was noticed in
> > "flt/cpu/s".
> >
> > Sample output from pft:
> > TAG pft:anon-sys-default:
> > Gb Thr CLine User System Wall flt/cpu/s fault/wsec
> > 15 16 1 0.67s 233.41s 14.76s 16798.546 266356.260
> >
> > +-------------------------------------------------------------------------+
> > N Min Max Median Avg Stddev
> > x 10 16682.962 17344.027 16913.524 16928.812 166.5362
> > + 10 16695.568 16923.896 16820.604 16824.652 84.816568
> > No difference proven at 95.0% confidence
> >
> > Change v3..v2
> > 1. removed the unnecessary function definition in memcontrol.h
> >
> > Signed-off-by: Ying Han <yinghan@google.com>
> > Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
>
> I'm wondering if we can just modify count_vm_event() directly for
> CONFIG_CGROUP_MEM_RES_CTLR so that we automatically track all vmstat items
> (those in enum vm_event_item) for each memcg. We could add an array of
> NR_VM_EVENT_ITEMS into each struct mem_cgroup to be incremented on
> count_vm_event() for current's memcg.
>
> If that's done, we wouldn't have to add additional calls for every vmstat
> item we want to duplicate from the global counters.
>
Maybe we will do that eventually.
For now, IIUC, over 50% of VM_EVENTS are needless for memcg (e.g. per-zone
stats), and such an array would consume a large amount of percpu area. I
think we would need to select events carefully even if we did that. Also,
memcg's current percpu stat is a mixture of vm_events and vm_stat; we may
need to sort them out and re-design it.
My concern is that I'm not sure we have enough percpu area for
vmstat+vmevents across 1000+ memcgs, or whether allocating that much would
be acceptable even if we could.
But yes, it seems worth considering.
Thanks,
-Kame
Thread overview: 12+ messages
2011-03-29 17:32 Ying Han
2011-03-30 1:16 ` KOSAKI Motohiro
2011-03-30 1:37 ` Ying Han
2011-03-30 1:54 ` KOSAKI Motohiro
2011-03-31 20:01 ` Andrew Morton
2011-04-13 20:12 ` David Rientjes
2011-04-13 23:52 ` KAMEZAWA Hiroyuki [this message]
2011-04-14 0:47 ` David Rientjes
2011-04-14 1:18 ` KAMEZAWA Hiroyuki
-- strict thread matches above, loose matches on Subject: below --
2011-03-29 6:16 Ying Han
2011-03-29 21:36 ` Minchan Kim
2011-03-30 2:47 ` Balbir Singh