* [Experimental] [PATCH 0/5] page_cgroup->flags diet.
From: KAMEZAWA Hiroyuki @ 2011-12-15 6:00 UTC (permalink / raw)
To: linux-mm
Cc: akpm, hannes, Michal Hocko, Balbir Singh, Hugh Dickins, Ying Han,
nishimura
To reduce the size of page_cgroup, we need to reduce its flags.
This patch series is a trial at removing flags, based on
linux-next +
memcg-add-mem_cgroup_replace_page_cache-to-fix-lru-issue.patch +
the 4 patches for 'simplify LRU handling' I posted.
So, I don't ask anyone to test this, but I want to hear other ideas
or comments on the implementation.
After this patch series, the page_cgroup flags are:

enum {
	/* flags for mem_cgroup */
	PCG_LOCK,	/* Lock for pc->mem_cgroup and following bits. */
	PCG_USED,	/* this object is in use. */
	PCG_MIGRATION,	/* under page migration */
	__NR_PCG_FLAGS,
};
That is 3 bits. I thought I could remove PCG_MIGRATION as well... but failed.
BTW, because of kmalloc()'s alignment, the low 3 bits of pc->mem_cgroup
must be 0. So, we could move these flags into the low bits of
pc->mem_cgroup... I guess (see the sketch below).
(But on second thought, we will eventually need to track blkio per page,
so I'm not sure whether we can remove page_cgroup in the end.)
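
A rough userspace sketch of that pointer-tagging idea (the mask width and
all names here are illustrative, assuming only the 8-byte alignment that
kmalloc() guarantees, as mentioned above):

	#include <assert.h>
	#include <stdint.h>
	#include <stdlib.h>

	#define PCG_FLAG_BITS	3
	#define PCG_FLAG_MASK	((1UL << PCG_FLAG_BITS) - 1)

	int main(void)
	{
		/* stand-in for a kmalloc()'d mem_cgroup */
		void *memcg = aligned_alloc(8, 64);
		uintptr_t word;

		/* alignment guarantees the low 3 bits are free */
		assert(((uintptr_t)memcg & PCG_FLAG_MASK) == 0);

		word = (uintptr_t)memcg | 0x3;	/* pack pointer + two flags */

		/* both halves round-trip losslessly */
		assert((void *)(word & ~PCG_FLAG_MASK) == memcg);
		assert((word & PCG_FLAG_MASK) == 0x3);
		free(memcg);
		return 0;
	}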
Anyway, I'm dumping the current trial series here. Please comment when you
have time. I'm not in a hurry and will not be able to make quick responses.
Thanks,
-Kame
* [RFC] [PATCH 1/5] memcg: simplify account moving check
From: KAMEZAWA Hiroyuki @ 2011-12-15 6:05 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: linux-mm, akpm, hannes, Michal Hocko, Balbir Singh, Hugh Dickins,
    Ying Han, nishimura
* Re: [RFC] [PATCH 1/5] memcg: simplify account moving check
From: Johannes Weiner @ 2011-12-15 10:04 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: linux-mm, akpm, Michal Hocko, Balbir Singh, Hugh Dickins, Ying Han,
    nishimura

On Thu, Dec 15, 2011 at 03:05:22PM +0900, KAMEZAWA Hiroyuki wrote:
> >From 528f5f2667da17c26e40d271b24691412e1cbe81 Mon Sep 17 00:00:00 2001
> From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Date: Thu, 15 Dec 2011 11:41:18 +0900
> Subject: [PATCH 1/5] memcg: simplify account moving check
>
> Currently, the percpu variable MEM_CGROUP_ON_MOVE is used to indicate
> that a memcg is under move_account() and that pc->mem_cgroup under it
> may be overwritten.
>
> But this value is almost read-only and is not worth being percpu.
> Use atomic_t instead.

I like this, but I think you can go one further. The only place I see
where the per-cpu counter is actually read is to avoid taking the
lock, but if you make that counter an atomic anyway - why bother?

Couldn't you remove the counter completely and just take move_lock
unconditionally in the page stat updating?
* Re: [RFC] [PATCH 1/5] memcg: simplify account moving check
From: KAMEZAWA Hiroyuki @ 2011-12-15 10:17 UTC (permalink / raw)
To: Johannes Weiner
Cc: linux-mm, akpm, Michal Hocko, Balbir Singh, Hugh Dickins, Ying Han,
    nishimura

On Thu, 15 Dec 2011 11:04:43 +0100
Johannes Weiner <hannes@cmpxchg.org> wrote:

> I like this, but I think you can go one further. The only place I see
> where the per-cpu counter is actually read is to avoid taking the
> lock, but if you make that counter an atomic anyway - why bother?
>
> Couldn't you remove the counter completely and just take move_lock
> unconditionally in the page stat updating?

Hmm, I (and Greg Thelen) were warned by Peter Zijlstra: 'please _never_
add atomic ops to this path'. That 'moving_account' condition check was
there to avoid atomic ops in this path (in most cases).

We'll need to gather enough performance data after implementing
per-memcg dirty ratio accounting. So, could you wait a while?

Anyway, my patch '[PATCH 5/5] memcg: remove PCG_MOVE_LOCK' may not work
well enough without the moving_account check. I'll think about it more.

Thank you for the review.
-Kame
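
For illustration, the trade-off being debated above looks roughly like
this in userspace terms (all names are toy stand-ins; in the kernel the
stat is per-cpu, which is what makes the unlocked fast path cheap and
safe there):

	#include <pthread.h>
	#include <stdatomic.h>

	struct memcg {
		atomic_int moving_account;	/* >0 while move_account() runs */
		pthread_mutex_t move_lock;
		long file_mapped;		/* per-cpu in the real code */
	};

	static struct memcg example = { .move_lock = PTHREAD_MUTEX_INITIALIZER };

	/* Hot path: elide the lock unless a move is in flight. The real
	 * code also has to close the window between this check and a move
	 * starting, which needs extra synchronization. */
	void update_page_stat(struct memcg *mc, int delta)
	{
		int locked = 0;

		if (atomic_load(&mc->moving_account)) {	/* rare slow path */
			pthread_mutex_lock(&mc->move_lock);
			locked = 1;
		}
		mc->file_mapped += delta;
		if (locked)
			pthread_mutex_unlock(&mc->move_lock);
	}

Johannes's alternative is simply to take move_lock unconditionally; the
question raised in the thread is whether the hot path can afford that.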
* [RFC][PATCH 2/5] memcg: safer page stat updating
From: KAMEZAWA Hiroyuki @ 2011-12-15 6:06 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: linux-mm, akpm, hannes, Michal Hocko, Balbir Singh, Hugh Dickins,
    Ying Han, nishimura

A better (lockless) idea is welcome.
* [RFC][PATCH 3/5] memcg: remove PCG_FILE_MAPPED
From: KAMEZAWA Hiroyuki @ 2011-12-15 6:07 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: linux-mm, akpm, hannes, Michal Hocko, Balbir Singh, Hugh Dickins,
    Ying Han, nishimura
* [RFC][PATCH 4/5] memcg: remove PCG_CACHE bit
From: KAMEZAWA Hiroyuki @ 2011-12-15 6:08 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: linux-mm, akpm, hannes, Michal Hocko, Balbir Singh, Hugh Dickins,
    Ying Han, nishimura
* Re: [RFC][PATCH 4/5] memcg: remove PCG_CACHE bit
From: Johannes Weiner @ 2011-12-15 10:24 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: linux-mm, akpm, Michal Hocko, Balbir Singh, Hugh Dickins, Ying Han,
    nishimura

On Thu, Dec 15, 2011 at 03:08:22PM +0900, KAMEZAWA Hiroyuki wrote:
> >From 577b441ae259728d83a99baba11bf4925b4542d4 Mon Sep 17 00:00:00 2001
> From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Date: Thu, 15 Dec 2011 12:09:03 +0900
> Subject: [PATCH 4/5] memcg: remove PCG_CACHE bit.
>
> This bit can be replaced by a PageAnon(page) check.

You need to make sure that only fully rmapped pages are
charged/uncharged, which is not the case yet. E.g. look at the
newpage_charge() in mm/memory.c::do_anonymous_page() - if the page
table is updated by someone else, the uncharge_page() in the error
path will see !PageAnon() because page->mapping is not set yet, and
uncharge it as cache.

What I think is required is to break up the charging and committing
like we do for swap cache already:

	if (!mem_cgroup_try_charge())
		goto error;
	page_add_new_anon_rmap()
	mem_cgroup_commit()

This will also allow us to get rid of passing around the charge
type everywhere...
* Re: [RFC][PATCH 4/5] memcg: remove PCG_CACHE bit
From: KAMEZAWA Hiroyuki @ 2011-12-15 10:36 UTC (permalink / raw)
To: Johannes Weiner
Cc: linux-mm, akpm, Michal Hocko, Balbir Singh, Hugh Dickins, Ying Han,
    nishimura

On Thu, 15 Dec 2011 11:24:42 +0100
Johannes Weiner <hannes@cmpxchg.org> wrote:

> You need to make sure that only fully rmapped pages are
> charged/uncharged, which is not the case yet. E.g. look at the
> newpage_charge() in mm/memory.c::do_anonymous_page() - if the page
> table is updated by someone else, the uncharge_page() in the error
> path will see !PageAnon() because page->mapping is not set yet, and
> uncharge it as cache.

Ah, nice catch. I missed the error path.

> What I think is required is to break up the charging and committing
> like we do for swap cache already:
>
>	if (!mem_cgroup_try_charge())
>		goto error;
>	page_add_new_anon_rmap()
>	mem_cgroup_commit()
>
> This will also allow us to get rid of passing around the charge
> type everywhere...

Thank you. I'll look into it.

To be honest, I want to remove the 'rss' and 'cache' counters ;(
They don't have much meaning after the LRU was split.

Thanks,
-Kame
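
The misaccounting Johannes describes above can be shown with a compilable
toy model (all names and counters here are illustrative stand-ins, not
kernel code): a page charged as anon but uncharged before rmap ever sets
page->mapping is misclassified as cache, skewing both counters.

	#include <stdbool.h>
	#include <stdio.h>

	struct page { void *mapping; };	/* NULL until rmap sets it */

	static long rss, cache;		/* toy per-memcg counters */

	static bool page_anon(struct page *p) { return p->mapping != NULL; }

	static void charge_newpage(void) { rss++; }	/* charged as anon */

	static void uncharge(struct page *p)	/* classifies by PageAnon() */
	{
		if (page_anon(p))
			rss--;
		else
			cache--;
	}

	int main(void)
	{
		struct page page = { .mapping = NULL };

		charge_newpage();
		/* PTE race: the error path runs before
		 * page_add_new_anon_rmap() ever sets page.mapping,
		 * so the page still looks like cache here. */
		uncharge(&page);

		printf("rss=%ld cache=%ld (both should be 0)\n", rss, cache);
		return 0;
	}

Splitting charge into try/commit around the rmap call, as Johannes
suggests, means classification happens only after the anon state is
final.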
* Re: [RFC][PATCH 4/5] memcg: remove PCG_CACHE bit
From: KAMEZAWA Hiroyuki @ 2011-12-15 12:04 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: Johannes Weiner, linux-mm, akpm, Michal Hocko, Balbir Singh,
    Hugh Dickins, Ying Han, nishimura

On Thu, 15 Dec 2011 19:36:31 +0900
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:

> On Thu, 15 Dec 2011 11:24:42 +0100
> Johannes Weiner <hannes@cmpxchg.org> wrote:
> > What I think is required is to break up the charging and committing
> > like we do for swap cache already:
> >
> >	if (!mem_cgroup_try_charge())
> >		goto error;
> >	page_add_new_anon_rmap()
> >	mem_cgroup_commit()
> >
> > This will also allow us to get rid of passing around the charge
> > type everywhere...
>
> Thank you. I'll look into it.
>
> To be honest, I want to remove the 'rss' and 'cache' counters ;(
> They don't have much meaning after the LRU was split.

I'll use this version for testing. This patch sits under a deep stack of
unmerged patches, anyway.

==
diff --git a/include/linux/page_cgroup.h b/include/linux/page_cgroup.h
index e4cb1bf..86967ed 100644
--- a/include/linux/page_cgroup.h
+++ b/include/linux/page_cgroup.h
@@ -4,7 +4,6 @@
 enum {
 	/* flags for mem_cgroup */
 	PCG_LOCK,  /* Lock for pc->mem_cgroup and following bits. */
-	PCG_CACHE, /* charged as cache */
 	PCG_USED, /* this object is in use. */
 	PCG_MIGRATION, /* under page migration */
 	/* flags for mem_cgroup and file and I/O status */
@@ -63,11 +62,6 @@ static inline void ClearPageCgroup##uname(struct page_cgroup *pc)	\
 static inline int TestClearPageCgroup##uname(struct page_cgroup *pc)	\
 	{ return test_and_clear_bit(PCG_##lname, &pc->flags); }
 
-/* Cache flag is set only once (at allocation) */
-TESTPCGFLAG(Cache, CACHE)
-CLEARPCGFLAG(Cache, CACHE)
-SETPCGFLAG(Cache, CACHE)
-
 TESTPCGFLAG(Used, USED)
 CLEARPCGFLAG(Used, USED)
 SETPCGFLAG(Used, USED)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index fdcf454..89c76f1 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2368,6 +2368,8 @@ static void __mem_cgroup_commit_charge(struct mem_cgroup *memcg,
 				       struct page_cgroup *pc,
 				       enum charge_type ctype)
 {
+	bool file = false;
+
 	lock_page_cgroup(pc);
 	if (unlikely(PageCgroupUsed(pc))) {
 		unlock_page_cgroup(pc);
@@ -2390,18 +2392,17 @@ static void __mem_cgroup_commit_charge(struct mem_cgroup *memcg,
 	switch (ctype) {
 	case MEM_CGROUP_CHARGE_TYPE_CACHE:
 	case MEM_CGROUP_CHARGE_TYPE_SHMEM:
-		SetPageCgroupCache(pc);
+		file = true;
 		SetPageCgroupUsed(pc);
 		break;
 	case MEM_CGROUP_CHARGE_TYPE_MAPPED:
-		ClearPageCgroupCache(pc);
 		SetPageCgroupUsed(pc);
 		break;
 	default:
 		break;
 	}
 
-	mem_cgroup_charge_statistics(memcg, PageCgroupCache(pc), nr_pages);
+	mem_cgroup_charge_statistics(memcg, file, nr_pages);
 	unlock_page_cgroup(pc);
 	WARN_ON_ONCE(PageLRU(page));
 	/*
@@ -2474,6 +2475,7 @@ static int mem_cgroup_move_account(struct page *page,
 				   bool uncharge)
 {
 	unsigned long flags;
+	bool file = false;
 	int ret;
 
 	VM_BUG_ON(from == to);
@@ -2503,14 +2505,17 @@ static int mem_cgroup_move_account(struct page *page,
 		__this_cpu_inc(to->stat->count[MEM_CGROUP_STAT_FILE_MAPPED]);
 		preempt_enable();
 	}
-	mem_cgroup_charge_statistics(from, PageCgroupCache(pc), -nr_pages);
+	/* Once PageAnon is set, it will not be cleared until freed. */
+	if (!PageAnon(page))
+		file = true;
+	mem_cgroup_charge_statistics(from, file, -nr_pages);
 	if (uncharge)
 		/* This is not "cancel", but cancel_charge does all we need. */
 		__mem_cgroup_cancel_charge(from, nr_pages);
 
 	/* caller should have done css_get */
 	pc->mem_cgroup = to;
-	mem_cgroup_charge_statistics(to, PageCgroupCache(pc), nr_pages);
+	mem_cgroup_charge_statistics(to, file, nr_pages);
 	/*
 	 * We charges against "to" which may not have any tasks. Then, "to"
 	 * can be under rmdir(). But in current implementation, caller of
@@ -2854,6 +2859,7 @@ __mem_cgroup_uncharge_common(struct page *page, enum charge_type ctype)
 	struct mem_cgroup *memcg = NULL;
 	unsigned int nr_pages = 1;
 	struct page_cgroup *pc;
+	bool file = false;
 
 	if (mem_cgroup_disabled())
 		return NULL;
@@ -2880,6 +2886,10 @@ __mem_cgroup_uncharge_common(struct page *page, enum charge_type ctype)
 		goto unlock_out;
 
 	switch (ctype) {
+	case MEM_CGROUP_CHARGE_TYPE_CACHE:
+	case MEM_CGROUP_CHARGE_TYPE_SHMEM:
+		file = true;
+		break;
 	case MEM_CGROUP_CHARGE_TYPE_MAPPED:
 	case MEM_CGROUP_CHARGE_TYPE_DROP:
 		/* See mem_cgroup_prepare_migration() */
@@ -2897,7 +2907,7 @@ __mem_cgroup_uncharge_common(struct page *page, enum charge_type ctype)
 		break;
 	}
 
-	mem_cgroup_charge_statistics(memcg, PageCgroupCache(pc), -nr_pages);
+	mem_cgroup_charge_statistics(memcg, file, -nr_pages);
 
 	ClearPageCgroupUsed(pc);
 	/*
@@ -2938,9 +2948,13 @@ void mem_cgroup_uncharge_page(struct page *page)
 
 void mem_cgroup_uncharge_cache_page(struct page *page)
 {
+	int ctype = MEM_CGROUP_CHARGE_TYPE_CACHE;
 	VM_BUG_ON(page_mapped(page));
 	VM_BUG_ON(page->mapping);
-	__mem_cgroup_uncharge_common(page, MEM_CGROUP_CHARGE_TYPE_CACHE);
+
+	if (page_is_file_cache(page))
+		ctype = MEM_CGROUP_CHARGE_TYPE_SHMEM;
+	__mem_cgroup_uncharge_common(page, ctype);
 }
 
 /*
@@ -3276,7 +3290,7 @@ void mem_cgroup_replace_page_cache(struct page *oldpage,
 	/* fix accounting on old pages */
 	lock_page_cgroup(pc);
 	memcg = pc->mem_cgroup;
-	mem_cgroup_charge_statistics(memcg, PageCgroupCache(pc), -1);
+	mem_cgroup_charge_statistics(memcg, true, -1);
 	ClearPageCgroupUsed(pc);
 	unlock_page_cgroup(pc);
* Re: [RFC][PATCH 4/5] memcg: remove PCG_CACHE bit
From: Johannes Weiner @ 2011-12-15 14:54 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: linux-mm, akpm, Michal Hocko, Balbir Singh, Hugh Dickins, Ying Han,
    nishimura

On Thu, Dec 15, 2011 at 09:04:06PM +0900, KAMEZAWA Hiroyuki wrote:
> On Thu, 15 Dec 2011 19:36:31 +0900
> KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:
> > To be honest, I want to remove the 'rss' and 'cache' counters ;(
> > They don't have much meaning after the LRU was split.
>
> I'll use this version for testing. This patch sits under a deep stack
> of unmerged patches, anyway.

Ok, makes sense. I can do the PCG_CACHE removal, btw; I have half the
patches sitting around anyway, I just need to fix up huge_memory.c.

> @@ -2938,9 +2948,13 @@ void mem_cgroup_uncharge_page(struct page *page)
>
>  void mem_cgroup_uncharge_cache_page(struct page *page)
>  {
> +	int ctype = MEM_CGROUP_CHARGE_TYPE_CACHE;
>  	VM_BUG_ON(page_mapped(page));
>  	VM_BUG_ON(page->mapping);
> -	__mem_cgroup_uncharge_common(page, MEM_CGROUP_CHARGE_TYPE_CACHE);
> +
> +	if (page_is_file_cache(page))
> +		ctype = MEM_CGROUP_CHARGE_TYPE_SHMEM;
> +	__mem_cgroup_uncharge_common(page, ctype);

I think this is missing a negation, but it doesn't matter because the
SHMEM and CACHE charge types are treated exactly the same way. I'll
send a patch series that removes it soon; there are more shmem-related
things...
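
For reference, the negated check Johannes means would presumably read as
follows (this assumes page_is_file_cache() is true only for non-swap-backed
pages, so shmem pages need the inverted test):

	/* shmem pages are swap-backed and thus not "file cache" */
	if (!page_is_file_cache(page))
		ctype = MEM_CGROUP_CHARGE_TYPE_SHMEM;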
* [RFC] [PATCH 5/5] memcg: remove PCG_MOVE_LOCK
From: KAMEZAWA Hiroyuki @ 2011-12-15 6:10 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: linux-mm, akpm, hannes, Michal Hocko, Balbir Singh, Hugh Dickins,
    Ying Han, nishimura
* Re: [Experimental] [PATCH 0/5] page_cgroup->flags diet.
From: KAMEZAWA Hiroyuki @ 2011-12-15 8:44 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: linux-mm, akpm, hannes, Michal Hocko, Balbir Singh, Hugh Dickins,
    Ying Han, nishimura

This untested patch reduces the size of page_cgroup to 8 bytes. After
enough tests, we'll be ready to integrate page_cgroup as a member of
struct page (with a CONFIG option?). I'll start tests when I can...

BTW, I haven't considered how to track the blkio owner for supporting
buffered I/O in the blkio cgroup. But I wonder whether it's enough to add
an interface tying memcg and blkio cgroup together, even if they are not
bind-mounted... Then, the blkio_id can be obtained via
page -> memcg -> blkio_id. Another idea is using page->private (or
similar) to record who the writer is; this info can be propagated to the
buffer_head or bio. The worst idea is adding a new field to page_cgroup.

==
diff --git a/include/linux/page_cgroup.h b/include/linux/page_cgroup.h
index f9441ca..48be740 100644
--- a/include/linux/page_cgroup.h
+++ b/include/linux/page_cgroup.h
@@ -23,8 +23,7 @@ enum {
  * then the page cgroup for pfn always exists.
  */
 struct page_cgroup {
-	unsigned long flags;
-	struct mem_cgroup *mem_cgroup;
+	unsigned long _flags; /* This flag only uses lower 3bits */
 };
 
 void __meminit pgdat_page_cgroup_init(struct pglist_data *pgdat);
@@ -46,19 +45,19 @@ struct page *lookup_cgroup_page(struct page_cgroup *pc);
 
 #define TESTPCGFLAG(uname, lname)			\
 static inline int PageCgroup##uname(struct page_cgroup *pc)	\
-	{ return test_bit(PCG_##lname, &pc->flags); }
+	{ return test_bit(PCG_##lname, &pc->_flags); }
 
 #define SETPCGFLAG(uname, lname)			\
 static inline void SetPageCgroup##uname(struct page_cgroup *pc)\
-	{ set_bit(PCG_##lname, &pc->flags); }
+	{ set_bit(PCG_##lname, &pc->_flags); }
 
 #define CLEARPCGFLAG(uname, lname)			\
 static inline void ClearPageCgroup##uname(struct page_cgroup *pc)	\
-	{ clear_bit(PCG_##lname, &pc->flags); }
+	{ clear_bit(PCG_##lname, &pc->_flags); }
 
 #define TESTCLEARPCGFLAG(uname, lname)			\
 static inline int TestClearPageCgroup##uname(struct page_cgroup *pc)	\
-	{ return test_and_clear_bit(PCG_##lname, &pc->flags); }
+	{ return test_and_clear_bit(PCG_##lname, &pc->_flags); }
 
 TESTPCGFLAG(Used, USED)
 CLEARPCGFLAG(Used, USED)
@@ -68,18 +67,33 @@ SETPCGFLAG(Migration, MIGRATION)
 CLEARPCGFLAG(Migration, MIGRATION)
 TESTPCGFLAG(Migration, MIGRATION)
 
+#define PCG_FLAG_MASK	((1 << (__NR_PCG_FLAGS)) - 1)
+
 static inline void lock_page_cgroup(struct page_cgroup *pc)
 {
 	/*
 	 * Don't take this lock in IRQ context.
 	 * This lock is for pc->mem_cgroup, USED, CACHE, MIGRATION
 	 */
-	bit_spin_lock(PCG_LOCK, &pc->flags);
+	bit_spin_lock(PCG_LOCK, &pc->_flags);
 }
 
 static inline void unlock_page_cgroup(struct page_cgroup *pc)
 {
-	bit_spin_unlock(PCG_LOCK, &pc->flags);
+	bit_spin_unlock(PCG_LOCK, &pc->_flags);
+}
+
+static inline struct mem_cgroup *pc_to_memcg(struct page_cgroup *pc)
+{
+	return (struct mem_cgroup *)
+		((unsigned long)pc->_flags & ~PCG_FLAG_MASK);
+}
+
+static inline void
+pc_set_memcg(struct page_cgroup *pc, struct mem_cgroup *memcg)
+{
+	unsigned long val = pc->_flags & PCG_FLAG_MASK;
+	pc->_flags = (unsigned long)memcg | val;
 }
 
 #else /* CONFIG_CGROUP_MEM_RES_CTLR */
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 66e03ad..8750e5a 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1103,7 +1103,7 @@ struct lruvec *mem_cgroup_lru_add_list(struct zone *zone, struct page *page,
 		return &zone->lruvec;
 
 	pc = lookup_page_cgroup(page);
-	memcg = pc->mem_cgroup;
+	memcg = pc_to_memcg(pc);
 	VM_BUG_ON(!memcg);
 	mz = page_cgroup_zoneinfo(memcg, page);
 	/* compound_order() is stabilized through lru_lock */
@@ -1131,7 +1131,7 @@ void mem_cgroup_lru_del_list(struct page *page, enum lru_list lru)
 		return;
 
 	pc = lookup_page_cgroup(page);
-	memcg = pc->mem_cgroup;
+	memcg = pc_to_memcg(pc);
 	VM_BUG_ON(!memcg);
 	mz = page_cgroup_zoneinfo(memcg, page);
 	/* huge page split is done under lru_lock. so, we have no races. */
@@ -1268,7 +1268,7 @@ mem_cgroup_get_reclaim_stat_from_page(struct page *page)
 		return NULL;
 	/* Ensure pc->mem_cgroup is visible after reading PCG_USED. */
 	smp_rmb();
-	mz = page_cgroup_zoneinfo(pc->mem_cgroup, page);
+	mz = page_cgroup_zoneinfo(pc_to_memcg(pc), page);
 	return &mz->reclaim_stat;
 }
 
@@ -1897,7 +1897,7 @@ bool __mem_cgroup_begin_update_page_stats(struct page *page, unsigned long *flag
 	bool need_unlock = false;
 
 	rcu_read_lock();
-	memcg = pc->mem_cgroup;
+	memcg = pc_to_memcg(pc);
 	if (!memcg || !PageCgroupUsed(pc))
 		goto out;
 	if (unlikely(mem_cgroup_stealed(memcg)) || PageTransHuge(page)) {
@@ -1926,7 +1926,7 @@ void __mem_cgroup_update_page_stat(struct page *page,
 			enum mem_cgroup_page_stat_item idx, int val)
 {
 	struct page_cgroup *pc = lookup_page_cgroup(page);
-	struct mem_cgroup *memcg = pc->mem_cgroup;
+	struct mem_cgroup *memcg = pc_to_memcg(pc);
 
 	if (!memcg || !PageCgroupUsed(pc))
 		return;
@@ -2400,7 +2400,7 @@ struct mem_cgroup *try_get_mem_cgroup_from_page(struct page *page)
 	pc = lookup_page_cgroup(page);
 	lock_page_cgroup(pc);
 	if (PageCgroupUsed(pc)) {
-		memcg = pc->mem_cgroup;
+		memcg = pc_to_memcg(pc);
 		if (memcg && !css_tryget(&memcg->css))
 			memcg = NULL;
 	} else if (PageSwapCache(page)) {
@@ -2434,7 +2434,7 @@ static void __mem_cgroup_commit_charge(struct mem_cgroup *memcg,
 	 * we don't need page_cgroup_lock about tail pages, becase they are not
 	 * accessed by any other context at this point.
 	 */
-	pc->mem_cgroup = memcg;
+	pc_set_memcg(pc, memcg);
 	/*
 	 * We access a page_cgroup asynchronously without lock_page_cgroup().
 	 * Especially when a page_cgroup is taken from a page, pc->mem_cgroup
@@ -2469,7 +2469,6 @@ static void __mem_cgroup_commit_charge(struct mem_cgroup *memcg,
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 
-#define PCGF_NOCOPY_AT_SPLIT ((1 << PCG_LOCK) | (1 << PCG_MIGRATION))
 /*
  * Because tail pages are not marked as "used", set it. We're under
  * zone->lru_lock, 'splitting on pmd' and compound_lock.
 */
@@ -2481,23 +2480,26 @@ void mem_cgroup_split_huge_fixup(struct page *head)
 	struct page_cgroup *head_pc = lookup_page_cgroup(head);
 	struct page_cgroup *pc;
 	struct mem_cgroup_per_zone *mz;
+	struct mem_cgroup *head_memcg;
 	enum lru_list lru;
 	int i;
 
 	if (mem_cgroup_disabled())
 		return;
+	head_memcg = pc_to_memcg(head_pc);
 	for (i = 1; i < HPAGE_PMD_NR; i++) {
 		pc = head_pc + i;
-		pc->mem_cgroup = head_pc->mem_cgroup;
+		pc_set_memcg(pc, head_memcg);
 		smp_wmb();/* see __commit_charge() */
-		pc->flags = head_pc->flags & ~PCGF_NOCOPY_AT_SPLIT;
+		/* this page is never be under page migration */
+		SetPageCgroupUsed(pc);
 	}
 	/*
 	 * Tail pages will be added to LRU.
 	 * We hold lru_lock,then,reduce counter directly.
 	 */
 	lru = page_lru(head);
-	mz = page_cgroup_zoneinfo(head_pc->mem_cgroup, head);
+	mz = page_cgroup_zoneinfo(head_memcg, head);
 	MEM_CGROUP_ZSTAT(mz, lru) -= HPAGE_PMD_NR - 1;
 }
 #endif
@@ -2545,7 +2547,7 @@ static int mem_cgroup_move_account(struct page *page,
 
 	lock_page_cgroup(pc);
 	ret = -EINVAL;
-	if (!PageCgroupUsed(pc) || pc->mem_cgroup != from)
+	if (!PageCgroupUsed(pc) || pc_to_memcg(pc) != from)
 		goto unlock;
 
 	mem_cgroup_move_account_wlock(page, &flags);
@@ -2563,7 +2565,7 @@ static int mem_cgroup_move_account(struct page *page,
 		__mem_cgroup_cancel_charge(from, nr_pages);
 
 	/* caller should have done css_get */
-	pc->mem_cgroup = to;
+	pc_set_memcg(pc, to);
 	mem_cgroup_charge_statistics(to, !PageAnon(page), nr_pages);
 	/*
 	 * We charges against "to" which may not have any tasks. Then, "to"
@@ -2928,7 +2930,7 @@ __mem_cgroup_uncharge_common(struct page *page, enum charge_type ctype)
 
 	lock_page_cgroup(pc);
 
-	memcg = pc->mem_cgroup;
+	memcg = pc_to_memcg(pc);
 
 	if (!PageCgroupUsed(pc))
 		goto unlock_out;
@@ -3109,7 +3111,7 @@ void mem_cgroup_reset_owner(struct page *newpage)
 
 	pc = lookup_page_cgroup(newpage);
 	VM_BUG_ON(PageCgroupUsed(pc));
-	pc->mem_cgroup = root_mem_cgroup;
+	pc_set_memcg(pc, root_mem_cgroup);
 }
 
 /**
@@ -3191,7 +3193,7 @@ int mem_cgroup_prepare_migration(struct page *page,
 	pc = lookup_page_cgroup(page);
 	lock_page_cgroup(pc);
 	if (PageCgroupUsed(pc)) {
-		memcg = pc->mem_cgroup;
+		memcg = pc_to_memcg(pc);
 		css_get(&memcg->css);
 		/*
 		 * At migrating an anonymous page, its mapcount goes down
@@ -3329,7 +3331,7 @@ void mem_cgroup_replace_page_cache(struct page *oldpage,
 	pc = lookup_page_cgroup(oldpage);
 	/* fix accounting on old pages */
 	lock_page_cgroup(pc);
-	memcg = pc->mem_cgroup;
+	memcg = pc_to_memcg(pc);
 	mem_cgroup_charge_statistics(memcg, !PageAnon(oldpage), -1);
 	ClearPageCgroupUsed(pc);
 	unlock_page_cgroup(pc);
@@ -3376,14 +3378,15 @@ void mem_cgroup_print_bad_page(struct page *page)
 	if (pc) {
 		int ret = -1;
 		char *path;
+		struct mem_cgroup *memcg = pc_to_memcg(pc);
 
 		printk(KERN_ALERT "pc:%p pc->flags:%lx pc->mem_cgroup:%p",
-		       pc, pc->flags, pc->mem_cgroup);
+		       pc, pc->_flags, memcg);
 
 		path = kmalloc(PATH_MAX, GFP_KERNEL);
 		if (path) {
 			rcu_read_lock();
-			ret = cgroup_path(pc->mem_cgroup->css.cgroup,
+			ret = cgroup_path(memcg->css.cgroup,
 					path, PATH_MAX);
 			rcu_read_unlock();
 		}
@@ -5247,7 +5250,7 @@ static int is_target_pte_for_mc(struct vm_area_struct *vma,
 	 * mem_cgroup_move_account() checks the pc is valid or not under
 	 * the lock.
 	 */
-	if (PageCgroupUsed(pc) && pc->mem_cgroup == mc.from) {
+	if (PageCgroupUsed(pc) && pc_to_memcg(pc) == mc.from) {
 		ret = MC_TARGET_PAGE;
 		if (target)
 			target->page = page;