From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
To: David Rientjes <rientjes@google.com>
Cc: Ying Han <yinghan@google.com>,
KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
Minchan Kim <minchan.kim@gmail.com>,
Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>,
Balbir Singh <balbir@linux.vnet.ibm.com>,
Tejun Heo <tj@kernel.org>,
Mark Brown <broonie@opensource.wolfsonmicro.com>,
Andrew Morton <akpm@linux-foundation.org>,
linux-mm@kvack.org
Subject: Re: [PATCH V3] Add the pagefault count into memcg stats
Date: Thu, 14 Apr 2011 08:52:39 +0900 [thread overview]
Message-ID: <20110414085239.a597fb5c.kamezawa.hiroyu@jp.fujitsu.com> (raw)
In-Reply-To: <alpine.DEB.2.00.1104131301180.8140@chino.kir.corp.google.com>
On Wed, 13 Apr 2011 13:12:33 -0700 (PDT)
David Rientjes <rientjes@google.com> wrote:
> On Tue, 29 Mar 2011, Ying Han wrote:
>
> > Two new stats in per-memcg memory.stat track the number of page
> > faults and the number of major page faults:
> >
> > "pgfault"
> > "pgmajfault"
> >
> > They differ from the "pgpgin"/"pgpgout" stats, which count the number of
> > pages charged/uncharged to the cgroup and say nothing about reading or
> > writing pages to disk.
> >
> > It is valuable to track these two stats both for measuring application
> > performance and for gauging the efficiency of the kernel page reclaim
> > path. Counting page faults per process is useful, but we also need the
> > aggregated value, since memcg monitors and controls processes on a
> > per-cgroup basis.
> >
> > Functional test: check the total number of pgfault/pgmajfault of all
> > memcgs and compare with global vmstat value:
> >
> > $ cat /proc/vmstat | grep fault
> > pgfault 1070751
> > pgmajfault 553
> >
> > $ cat /dev/cgroup/memory.stat | grep fault
> > pgfault 1071138
> > pgmajfault 553
> > total_pgfault 1071142
> > total_pgmajfault 553
> >
> > $ cat /dev/cgroup/A/memory.stat | grep fault
> > pgfault 199
> > pgmajfault 0
> > total_pgfault 199
> > total_pgmajfault 0
> >
> > Performance test: run the page fault test (pft) with 16 threads, faulting
> > in 15G of anon pages in a 16G container. No regression was noticed in
> > "flt/cpu/s".
> >
> > Sample output from pft:
> > TAG pft:anon-sys-default:
> > Gb Thr CLine User System Wall flt/cpu/s fault/wsec
> > 15 16 1 0.67s 233.41s 14.76s 16798.546 266356.260
> >
> > +-------------------------------------------------------------------------+
> > N Min Max Median Avg Stddev
> > x 10 16682.962 17344.027 16913.524 16928.812 166.5362
> > + 10 16695.568 16923.896 16820.604 16824.652 84.816568
> > No difference proven at 95.0% confidence
> >
> > Change v3..v2
> > 1. removed the unnecessary function definition in memcontrol.h
> >
> > Signed-off-by: Ying Han <yinghan@google.com>
> > Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
>
> I'm wondering if we can just modify count_vm_event() directly for
> CONFIG_CGROUP_MEM_RES_CTLR so that we automatically track all vmstat items
> (those in enum vm_event_item) for each memcg. We could add an array of
> NR_VM_EVENT_ITEMS into each struct mem_cgroup to be incremented on
> count_vm_event() for current's memcg.
>
> If that's done, we wouldn't have to add additional calls for every vmstat
> item we want to duplicate from the global counters.
>
Maybe we will do that eventually.
For now, IIUC, over 50% of VM_EVENTS are needless for memcg (e.g. per-zone
stats), and such an array would consume a large amount of percpu area. I
think we would need to select events carefully even if we did that. Also,
memcg's current percpu stat is a mixture of vm_events and vm_stat; we may
need to sort them out and re-design it.
My concern is that I'm not sure we have enough percpu area for
vmstat+vmevents across 1000+ memcgs, or whether allocating that much would
be acceptable even if we could.
But yes, it seems worth considering.
Thanks,
-Kame
Thread overview: 12+ messages
2011-03-29 17:32 Ying Han
2011-03-30 1:16 ` KOSAKI Motohiro
2011-03-30 1:37 ` Ying Han
2011-03-30 1:54 ` KOSAKI Motohiro
2011-03-31 20:01 ` Andrew Morton
2011-04-13 20:12 ` David Rientjes
2011-04-13 23:52 ` KAMEZAWA Hiroyuki [this message]
2011-04-14 0:47 ` David Rientjes
2011-04-14 1:18 ` KAMEZAWA Hiroyuki
-- strict thread matches above, loose matches on Subject: below --
2011-03-29 6:16 Ying Han
2011-03-29 21:36 ` Minchan Kim
2011-03-30 2:47 ` Balbir Singh