Sorry, I forgot to post the script I used to capture the result:

echo $$ > /dev/cgroup/memory/A/tasks
time cat /export/hdc3/dd_A/tf0 > /dev/zero &
sleep 10
echo $$ > /dev/cgroup/memory/tasks
( while /root/getdelays -dip `pidof cat`; do sleep 10; done )
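
For completeness, the memcg itself was set up beforehand, roughly like this (just a sketch; the low/high watermarks are configured through the per-memcg kswapd patch's own interface, and the resulting values are what memory.reclaim_wmarks shows in the config dump below):

# create group A under the memory cgroup mount and set the 4G hard limit
mkdir /dev/cgroup/memory/A
echo 4294967296 > /dev/cgroup/memory/A/memory.limit_in_bytes
# the reclaim watermarks are then set via the patch's interface; verify them:
cat /dev/cgroup/memory/A/memory.reclaim_wmarks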

--Ying

On Fri, May 13, 2011 at 5:25 PM, Ying Han wrote:
> Here are some tests I ran and the results.
>
> On a 32G machine, I created a memcg with a 4G hard_limit (limit_in_bytes)
> and ran cat on a 20G file. Then I used getdelays to measure the ttfp
> "delay average" under RECLAIM. Once the workload reaches its hard_limit
> and there is no background reclaim, each ttfp is triggered by a page
> fault. I would like to demonstrate the average ttfp delay (and thus the
> page fault latency) for the streaming read/write workload and compare it
> with per-memcg background reclaim enabled.
>
> Note:
> 1. I applied a patch to getdelays.c from Fengguang which shows the
> average CPU/IO/SWAP/RECLAIM delays in ns.
>
> 2. I used my latest version of the per-memcg-per-kswapd patch for the
> following test. The patch may have improved since then, and I can rerun
> the same test when Kame has his patch ready.
>
> Configuration:
> $ cat /proc/meminfo
> MemTotal: 33045832 kB
>
> $ cat /dev/cgroup/memory/A/memory.limit_in_bytes
> 4294967296
>
> $ cat /dev/cgroup/memory/A/memory.reclaim_wmarks
> low_wmark 4137680896
> high_wmark 4085252096
>
> Test:
> $ echo $$ >/dev/cgroup/memory/A/tasks
> $ cat /export/hdc3/dd_A/tf0 > /dev/zero
>
> Without per-memcg background reclaim:
>
> CPU        count     real total  virtual total    delay total  delay average
>           176589    17248377848    27344548685     1093693318     6193.440ns
> IO         count    delay total  delay average
>           160704   242072632962      1506326ns
> SWAP       count    delay total  delay average
>                0              0            0ns
> RECLAIM    count    delay total  delay average
>            15944     3512140153       220279ns
> cat: read=20947877888, write=0, cancelled_write=0
>
> real    4m26.912s
> user    0m0.227s
> sys     0m27.823s
>
> With per-memcg background reclaim:
>
> $ ps -ef | grep memcg
> root      5803     2  2 13:56 ?        00:04:20 [memcg_4]
>
> CPU        count     real total  virtual total    delay total  delay average
>           161085    13185995424    23863858944       72902585      452.572ns
> IO         count    delay total  delay average
>           160915   246145533109      1529661ns
> SWAP       count    delay total  delay average
>                0              0            0ns
> RECLAIM    count    delay total  delay average
>                0              0            0ns
> cat: read=20974891008, write=0, cancelled_write=0
>
> real    4m26.572s
> user    0m0.246s
> sys     0m24.192s
>
> memcg_4 cputime: 2.86sec
>
> Observation:
> 1. Without background reclaim, cat hits ttfp heavily and the RECLAIM
> "delay average" goes above 220 microseconds.
>
> 2. With background reclaim, the ttfp delay average is always 0. Since
> ttfp happens synchronously, its delay translates directly into
> application latency over time.
>
> 3. The real time is slightly better with background reclaim, and the sys
> time is about the same (adding the memcg_4 time on top of cat's sys
> time). I don't expect a big CPU benefit, though. Async reclaim uses spare
> CPU time to proactively reclaim pages on the side, which guarantees less
> latency variation for the application over time.
>
> --Ying
>
> On Thu, May 12, 2011 at 10:10 PM, Ying Han wrote:
>>
>> On Thu, May 12, 2011 at 8:03 PM, KAMEZAWA Hiroyuki
>> <kamezawa.hiroyu@jp.fujitsu.com> wrote:
>>
>>> On Thu, 12 May 2011 17:17:25 +0900
>>> KAMEZAWA Hiroyuki wrote:
>>>
>>> > On Thu, 12 May 2011 13:22:37 +0900
>>> > KAMEZAWA Hiroyuki wrote:
>>> > I'll check what code in vmscan.c or mm/ affects memcg and post the
>>> > required fixes step by step. I think I found some.
>>> >
>>>
>>> After some tests, I suspect the 'automatic' one is unnecessary until
>>> memcg's dirty_ratio is supported. And as Andrew pointed out, total cpu
>>> consumption is unchanged, and I don't have a workload that shows me a
>>> meaningful speed-up.
>>>
>>
>> Total cpu consumption is one way to measure background reclaim; another
>> thing I would like to measure is a histogram of page fault latency for a
>> heavy page-allocation application. I would expect that with background
>> reclaim we get less variation in page fault latency than without it.
>>
>> Sorry, I haven't had a chance to run tests to back that up yet. I will
>> try to get some data.
>>
>>> But I guess... with dirty_ratio, the amount of dirty pages in a memcg
>>> is limited, and background reclaim can work well enough without the
>>> noise of writepage() while applications are throttled by dirty_ratio.
>>>
>>
>> Definitely. I ran into this issue while debugging soft_limit reclaim:
>> background reclaim became very inefficient once the amount of dirty
>> pages grew beyond the soft_limit. Talking with Greg about this in the
>> context of his per-memcg dirty page limit effort, we should consider
>> setting the dirty ratio so that dirty pages are not allowed to grow
>> beyond the reclaim watermarks (here, the soft_limit).
>>
>> --Ying
>>
>>> Hmm, I'll study this for a while, but it seems better to start with an
>>> active soft limit (or some threshold users can set) first.
>>>
>>> Anyway, this work made me look at vmscan.c carefully, and I think I can
>>> post some patches with fixes and tuning.
>>>
>>> Thanks,
>>> -Kame
>>>
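
P.S. As a sanity check on the numbers above: the "delay average" column in the getdelays output is simply delay total / count, e.g. for the RECLAIM row of the run without background reclaim:

$ awk 'BEGIN { printf "%.0f ns\n", 3512140153 / 15944 }'
220279 ns

which matches the ~220 usec per reclaim stall reported there.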