From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
To: Ying Han <yinghan@google.com>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>,
Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>,
Li Zefan <lizf@cn.fujitsu.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Rik van Riel <riel@redhat.com>,
Andrea Arcangeli <aarcange@redhat.com>,
Greg Thelen <gthelen@google.com>,
Minchan Kim <minchan.kim@gmail.com>, Mel Gorman <mel@csn.ul.ie>,
KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
Michal Hocko <mhocko@suse.cz>, Zhu Yanhai <zhu.yanhai@gmail.com>,
linux-mm@kvack.org
Subject: Re: [RFC][PATCH] memcg: isolate pages in memcg lru from global lru
Date: Thu, 31 Mar 2011 11:25:32 +0900
Message-ID: <20110331112532.82ed25ad.kamezawa.hiroyu@jp.fujitsu.com>
In-Reply-To: <1301532498-20309-1-git-send-email-yinghan@google.com>
On Wed, 30 Mar 2011 17:48:18 -0700
Ying Han <yinghan@google.com> wrote:
> In the memory controller, we do both targeted reclaim and global reclaim. The
> latter walks through the global lru which links all the allocated pages
> on the system. It breaks the memory isolation since pages are evicted
> regardless of their memcg owners. This patch takes pages off global lru
> as long as they are added to per-memcg lru.
>
> Memcg and cgroup together provide the solution of memory isolation where
> multiple cgroups run in parallel without interfering with each other. In
> vm, memory isolation requires changes in both page allocation and page
> reclaim. The current memcg provides good user page accounting, but needs
> more work on the page reclaim.
>
> In an over-committed machine w/ 32G ram, here is the configuration:
>
> cgroup-A/ -- limit_in_bytes = 20G, soft_limit_in_bytes = 15G
> cgroup-B/ -- limit_in_bytes = 20G, soft_limit_in_bytes = 15G
>
> 1) limit_in_bytes is the hard_limit where process will be throttled or OOM
> killed by going over the limit.
> 2) memory between soft_limit and limit_in_bytes are best-effort. soft_limit
> provides "guarantee" in some sense.
>
> Then, it is easy to generate the following scenario where:
>
> cgroup-A/ -- usage_in_bytes = 20G
> cgroup-B/ -- usage_in_bytes = 12G
>
> The global memory pressure triggers while cgroup-A keeps allocating memory. At
> this point, pages belonging to cgroup-B can be evicted from the global LRU.
>
> We do have per-memcg targeting reclaim including per-memcg background reclaim
> and soft_limit reclaim. Both of them need some improvement, and regardless we
> still need this patch since it breaks isolation.
>
> Besides, here is to-do list I have on memcg page reclaim and they are sorted.
> a) per-memcg background reclaim. to reclaim pages proactively
Agreed.
> b) skipping global lru reclaim if soft_limit reclaim does enough work. this is
> both for global background reclaim and global ttfp reclaim.
Agreed. But zone balancing cannot be avoided for now, so I think we need
inter-zone page migration to balance memory between zones, if necessary.
> c) improve the soft_limit reclaim to be efficient.
This must be done.
> d) isolate pages in memcg from global list since it breaks memory isolation.
>
I will not agree to this until a), b), and c) are fixed; before that, we can
go nowhere.
BTW, from another point of view, to reduce the size of page_cgroup we must
remove ->lru from page_cgroup. If divide-and-conquer memory reclaim works well
enough, we can do that. But this is a big global VM change, so we need enough
justification.
> I have some basic test on this patch and more tests definitely are needed:
>
> Functional:
> two memcgs under root. cgroup-A is reading 20g file with 2g limit,
> cgroup-B is running random stuff with 500m limit. Check the counters for
> per-memcg lru and global lru, and they should add-up.
>
> 1) total file pages
> $ cat /proc/meminfo | grep Cache
> Cached: 6032128 kB
>
> 2) file lru on global lru
> $ cat /proc/vmstat | grep file
> nr_inactive_file 0
> nr_active_file 963131
>
> 3) file lru on root cgroup
> $ cat /dev/cgroup/memory.stat | grep file
> inactive_file 0
> active_file 0
>
> 4) file lru on cgroup-A
> $ cat /dev/cgroup/A/memory.stat | grep file
> inactive_file 2145759232
> active_file 0
>
> 5) file lru on cgroup-B
> $ cat /dev/cgroup/B/memory.stat | grep file
> inactive_file 401408
> active_file 143360
>
> Performance:
> run page fault test(pft) with 16 thread on faulting in 15G anon pages
> in 16G cgroup. There is no regression noticed on "flt/cpu/s"
>
You need a fix for /proc/meminfo and /proc/vmstat so that they count memcg's pages, too ;)
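As a rough sanity check on the quoted numbers, the counters can be added up
like this. It is only a sketch: it assumes 4 KiB pages, that /proc/vmstat
reports pages, and that memory.stat reports bytes. The helper names are made
up for illustration.

```python
# Sketch: do the file LRU counters from the quoted test add up to "Cached"?
# Assumptions (not stated in the mail): 4 KiB pages, /proc/vmstat values are
# in pages, memory.stat values are in bytes. Helper names are hypothetical.

PAGE_KB = 4  # assumed page size in kB


def pages_to_kb(pages, page_kb=PAGE_KB):
    """Convert a page count (as in /proc/vmstat) to kB."""
    return pages * page_kb


def bytes_to_kb(nbytes):
    """Convert a byte count (as in memory.stat) to kB."""
    return nbytes // 1024


# Figures copied from the quoted test output.
cached_kb = 6032128                       # /proc/meminfo: Cached
global_file_pages = 0 + 963131            # nr_inactive_file + nr_active_file
memcg_file_bytes = [
    0, 0,                                 # root cgroup: inactive, active
    2145759232, 0,                        # cgroup-A: inactive, active
    401408, 143360,                       # cgroup-B: inactive, active
]

global_file_kb = pages_to_kb(global_file_pages)
memcg_file_kb = sum(bytes_to_kb(b) for b in memcg_file_bytes)
total_lru_kb = global_file_kb + memcg_file_kb
gap_kb = cached_kb - total_lru_kb

print("global file lru:", global_file_kb, "kB")
print("memcg file lru: ", memcg_file_kb, "kB")
print("sum:            ", total_lru_kb, "kB")
print("Cached - sum:   ", gap_kb, "kB")
```

Under those assumptions the sum comes close to Cached but does not match it
exactly, and no single file reports the total.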
Anyway, this seems too aggressive to me for now. Please do a), b), and c) first.
IIUC, this patch itself can cause a livelock when the soft limit is misconfigured.
What is the protection against a wrong soft limit?
If we do this kind of LRU isolation, we'll need some limit on the sum of the
limits of all memcgs to avoid misconfiguration. That may change the UI dramatically.
(As the RT-class CPU-limiting cgroup does.)
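To illustrate the kind of check I mean, here is a user-space sketch using the
configuration from your example (32G RAM, two groups with a 20G limit and a
15G soft limit). The function name and the rule "sum must not exceed RAM" are
assumptions for illustration only, not an existing kernel interface.

```python
# Sketch of a configuration check on the sum of memcg limits.
# check_limits() and the "sum <= total RAM" rule are hypothetical; they only
# illustrate the idea and do not correspond to any existing kernel interface.

GIB = 1 << 30


def check_limits(total_ram, groups):
    """groups maps a cgroup name to (limit_in_bytes, soft_limit_in_bytes)."""
    hard_sum = sum(hard for hard, _soft in groups.values())
    soft_sum = sum(soft for _hard, soft in groups.values())
    return {
        "hard_sum": hard_sum,
        "soft_sum": soft_sum,
        "hard_overcommitted": hard_sum > total_ram,
        "soft_overcommitted": soft_sum > total_ram,
    }


# Configuration from the quoted example.
report = check_limits(
    32 * GIB,
    {
        "cgroup-A": (20 * GIB, 15 * GIB),
        "cgroup-B": (20 * GIB, 15 * GIB),
    },
)

for key, value in report.items():
    print(key, value)
```

In this example the soft limits sum to 30G and fit in RAM, while the hard
limits sum to 40G and do not. A rule of this kind would have to reject or
warn about configurations where the soft limits together exceed RAM.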
Anyway, thank you for the data.
Thanks,
-Kame