* [Experimental] [PATCH 0/5] page_cgroup->flags diet.
From: KAMEZAWA Hiroyuki @ 2011-12-15 6:00 UTC (permalink / raw)
To: linux-mm
Cc: akpm, hannes, Michal Hocko, Balbir Singh, Hugh Dickins, Ying Han,
nishimura
To reduce the size of page_cgroup, we need to reduce the number of flags.
This patch series is a trial at removing flags, based on
linux-next +
memcg-add-mem_cgroup_replace_page_cache-to-fix-lru-issue.patch +
the 4 patches for 'simplify LRU handling' that I posted.
So, I am not asking anyone to test this; rather, I'd like to hear other ideas
or comments on the implementation.
After this patch series, the page_cgroup flags are:
enum {
	/* flags for mem_cgroup */
	PCG_LOCK,	/* Lock for pc->mem_cgroup and following bits. */
	PCG_USED,	/* this object is in use. */
	PCG_MIGRATION,	/* under page migration */
	__NR_PCG_FLAGS,
};
That is 3 bits. I thought I could remove PCG_MIGRATION as well, but failed.
BTW, because of kmalloc()'s alignment, the low 3 bits of pc->mem_cgroup must be 0,
so I guess we can move these flags into the low bits of pc->mem_cgroup.
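A minimal user-space sketch of that packing (not kernel code; it just assumes
mem_cgroup objects are at least 8-byte aligned, so the low 3 bits of the
pointer are always zero):

#include <assert.h>
#include <stdint.h>

#define PCG_FLAG_MASK	0x7UL		/* low 3 bits reserved for flags */

struct mem_cgroup;			/* opaque here */

struct page_cgroup {
	unsigned long _flags;		/* flags in the low bits, pointer above */
};

static struct mem_cgroup *pc_to_memcg(struct page_cgroup *pc)
{
	return (struct mem_cgroup *)(pc->_flags & ~PCG_FLAG_MASK);
}

static void pc_set_memcg(struct page_cgroup *pc, struct mem_cgroup *memcg)
{
	/* the alignment assumption keeps the flag bits of the pointer zero */
	assert(((uintptr_t)memcg & PCG_FLAG_MASK) == 0);
	pc->_flags = (unsigned long)memcg | (pc->_flags & PCG_FLAG_MASK);
}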
(On the other hand, we eventually need to track blkio per page, so I'm not
sure whether we can remove page_cgroup in the end.)
Anyway, here is a dump of the current trial series. Please comment when you
have time. I'm not in a hurry and will not be able to respond quickly.
Thanks,
-Kame
* [RFC] [PATCH 1/5] memcg: simplify account moving check
From: KAMEZAWA Hiroyuki @ 2011-12-15 6:05 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: linux-mm, akpm, hannes, Michal Hocko, Balbir Singh, Hugh Dickins,
Ying Han, nishimura
* [RFC][PATCH 2/5] memcg: safer page stat updating
From: KAMEZAWA Hiroyuki @ 2011-12-15 6:06 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: linux-mm, akpm, hannes, Michal Hocko, Balbir Singh, Hugh Dickins,
Ying Han, nishimura
A better (lockless) idea is welcome.
* [RFC][PATCH 3/5] memcg: remove PCG_FILE_MAPPED
From: KAMEZAWA Hiroyuki @ 2011-12-15 6:07 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: linux-mm, akpm, hannes, Michal Hocko, Balbir Singh, Hugh Dickins,
Ying Han, nishimura
* [RFC][PATCH 4/5] memcg: remove PCG_CACHE bit
From: KAMEZAWA Hiroyuki @ 2011-12-15 6:08 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: linux-mm, akpm, hannes, Michal Hocko, Balbir Singh, Hugh Dickins,
Ying Han, nishimura
* [RFC] [PATCH 5/5] memcg: remove PCG_MOVE_LOCK
From: KAMEZAWA Hiroyuki @ 2011-12-15 6:10 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: linux-mm, akpm, hannes, Michal Hocko, Balbir Singh, Hugh Dickins,
Ying Han, nishimura
* Re: [Experimental] [PATCH 0/5] page_cgroup->flags diet.
From: KAMEZAWA Hiroyuki @ 2011-12-15 8:44 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: linux-mm, akpm, hannes, Michal Hocko, Balbir Singh, Hugh Dickins,
Ying Han, nishimura
This untested patch reduces the size of page_cgroup to 8 bytes.
After enough testing, we'll be ready to integrate page_cgroup as a member of
struct page (behind a CONFIG option?).
I'll start testing when I can.
BTW, I haven't considered how to track the blkio owner for supporting buffered
I/O in the blkio cgroup. But I wonder whether it's enough to add an interface
that ties a memcg and a blkio cgroup together even if they are not bind-mounted...
Then, the blkio_id could be obtained via page -> memcg -> blkio_id (sketched
below, after the list of ideas).
Another idea is to use page->private (or similar) to record who the writer is;
this info can be propagated to the buffer_head or bio.
The worst idea is adding a new field to page_cgroup.
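For the first idea, the lookup could then be as simple as this sketch, reusing
the pc_to_memcg() helper from the patch below (blkio_id and page_blkio_id are
hypothetical names, not existing interfaces):

struct mem_cgroup {
	/* ... */
	int blkio_id;	/* hypothetical: id of the tied blkio cgroup */
};

static int page_blkio_id(struct page *page)
{
	struct page_cgroup *pc = lookup_page_cgroup(page);

	return pc_to_memcg(pc)->blkio_id;
}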
==
diff --git a/include/linux/page_cgroup.h b/include/linux/page_cgroup.h
index f9441ca..48be740 100644
--- a/include/linux/page_cgroup.h
+++ b/include/linux/page_cgroup.h
@@ -23,8 +23,7 @@ enum {
* then the page cgroup for pfn always exists.
*/
struct page_cgroup {
- unsigned long flags;
- struct mem_cgroup *mem_cgroup;
+ unsigned long _flags; /* low 3 bits are flags, the rest holds the mem_cgroup pointer */
};
void __meminit pgdat_page_cgroup_init(struct pglist_data *pgdat);
@@ -46,19 +45,19 @@ struct page *lookup_cgroup_page(struct page_cgroup *pc);
#define TESTPCGFLAG(uname, lname) \
static inline int PageCgroup##uname(struct page_cgroup *pc) \
- { return test_bit(PCG_##lname, &pc->flags); }
+ { return test_bit(PCG_##lname, &pc->_flags); }
#define SETPCGFLAG(uname, lname) \
static inline void SetPageCgroup##uname(struct page_cgroup *pc)\
- { set_bit(PCG_##lname, &pc->flags); }
+ { set_bit(PCG_##lname, &pc->_flags); }
#define CLEARPCGFLAG(uname, lname) \
static inline void ClearPageCgroup##uname(struct page_cgroup *pc) \
- { clear_bit(PCG_##lname, &pc->flags); }
+ { clear_bit(PCG_##lname, &pc->_flags); }
#define TESTCLEARPCGFLAG(uname, lname) \
static inline int TestClearPageCgroup##uname(struct page_cgroup *pc) \
- { return test_and_clear_bit(PCG_##lname, &pc->flags); }
+ { return test_and_clear_bit(PCG_##lname, &pc->_flags); }
TESTPCGFLAG(Used, USED)
CLEARPCGFLAG(Used, USED)
@@ -68,18 +67,33 @@ SETPCGFLAG(Migration, MIGRATION)
CLEARPCGFLAG(Migration, MIGRATION)
TESTPCGFLAG(Migration, MIGRATION)
+#define PCG_FLAG_MASK ((1 << (__NR_PCG_FLAGS)) - 1)
+
static inline void lock_page_cgroup(struct page_cgroup *pc)
{
/*
* Don't take this lock in IRQ context.
* This lock is for pc->mem_cgroup, USED, CACHE, MIGRATION
*/
- bit_spin_lock(PCG_LOCK, &pc->flags);
+ bit_spin_lock(PCG_LOCK, &pc->_flags);
}
static inline void unlock_page_cgroup(struct page_cgroup *pc)
{
- bit_spin_unlock(PCG_LOCK, &pc->flags);
+ bit_spin_unlock(PCG_LOCK, &pc->_flags);
+}
+
+static inline struct mem_cgroup *pc_to_memcg(struct page_cgroup *pc)
+{
+ return (struct mem_cgroup *)
+ ((unsigned long)pc->_flags & ~PCG_FLAG_MASK);
+}
+
+static inline void
+pc_set_memcg(struct page_cgroup *pc, struct mem_cgroup *memcg)
+{
+ unsigned long val = pc->_flags & PCG_FLAG_MASK;
+ pc->_flags = (unsigned long)memcg | val;
}
#else /* CONFIG_CGROUP_MEM_RES_CTLR */
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 66e03ad..8750e5a 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1103,7 +1103,7 @@ struct lruvec *mem_cgroup_lru_add_list(struct zone *zone, struct page *page,
return &zone->lruvec;
pc = lookup_page_cgroup(page);
- memcg = pc->mem_cgroup;
+ memcg = pc_to_memcg(pc);
VM_BUG_ON(!memcg);
mz = page_cgroup_zoneinfo(memcg, page);
/* compound_order() is stabilized through lru_lock */
@@ -1131,7 +1131,7 @@ void mem_cgroup_lru_del_list(struct page *page, enum lru_list lru)
return;
pc = lookup_page_cgroup(page);
- memcg = pc->mem_cgroup;
+ memcg = pc_to_memcg(pc);
VM_BUG_ON(!memcg);
mz = page_cgroup_zoneinfo(memcg, page);
/* huge page split is done under lru_lock. so, we have no races. */
@@ -1268,7 +1268,7 @@ mem_cgroup_get_reclaim_stat_from_page(struct page *page)
return NULL;
/* Ensure pc->mem_cgroup is visible after reading PCG_USED. */
smp_rmb();
- mz = page_cgroup_zoneinfo(pc->mem_cgroup, page);
+ mz = page_cgroup_zoneinfo(pc_to_memcg(pc), page);
return &mz->reclaim_stat;
}
@@ -1897,7 +1897,7 @@ bool __mem_cgroup_begin_update_page_stats(struct page *page, unsigned long *flag
bool need_unlock = false;
rcu_read_lock();
- memcg = pc->mem_cgroup;
+ memcg = pc_to_memcg(pc);
if (!memcg || !PageCgroupUsed(pc))
goto out;
if (unlikely(mem_cgroup_stealed(memcg)) || PageTransHuge(page)) {
@@ -1926,7 +1926,7 @@ void __mem_cgroup_update_page_stat(struct page *page,
enum mem_cgroup_page_stat_item idx, int val)
{
struct page_cgroup *pc = lookup_page_cgroup(page);
- struct mem_cgroup *memcg = pc->mem_cgroup;
+ struct mem_cgroup *memcg = pc_to_memcg(pc);
if (!memcg || !PageCgroupUsed(pc))
return;
@@ -2400,7 +2400,7 @@ struct mem_cgroup *try_get_mem_cgroup_from_page(struct page *page)
pc = lookup_page_cgroup(page);
lock_page_cgroup(pc);
if (PageCgroupUsed(pc)) {
- memcg = pc->mem_cgroup;
+ memcg = pc_to_memcg(pc);
if (memcg && !css_tryget(&memcg->css))
memcg = NULL;
} else if (PageSwapCache(page)) {
@@ -2434,7 +2434,7 @@ static void __mem_cgroup_commit_charge(struct mem_cgroup *memcg,
* we don't need page_cgroup_lock for tail pages, because they are not
* accessed by any other context at this point.
*/
- pc->mem_cgroup = memcg;
+ pc_set_memcg(pc, memcg);
/*
* We access a page_cgroup asynchronously without lock_page_cgroup().
* Especially when a page_cgroup is taken from a page, pc->mem_cgroup
@@ -2469,7 +2469,6 @@ static void __mem_cgroup_commit_charge(struct mem_cgroup *memcg,
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
-#define PCGF_NOCOPY_AT_SPLIT ((1 << PCG_LOCK) | (1 << PCG_MIGRATION))
/*
* Because tail pages are not marked as "used", set it. We're under
* zone->lru_lock, 'splitting on pmd' and compound_lock.
@@ -2481,23 +2480,26 @@ void mem_cgroup_split_huge_fixup(struct page *head)
struct page_cgroup *head_pc = lookup_page_cgroup(head);
struct page_cgroup *pc;
struct mem_cgroup_per_zone *mz;
+ struct mem_cgroup *head_memcg;
enum lru_list lru;
int i;
if (mem_cgroup_disabled())
return;
+ head_memcg = pc_to_memcg(head_pc);
for (i = 1; i < HPAGE_PMD_NR; i++) {
pc = head_pc + i;
- pc->mem_cgroup = head_pc->mem_cgroup;
+ pc_set_memcg(pc, head_memcg);
smp_wmb();/* see __commit_charge() */
- pc->flags = head_pc->flags & ~PCGF_NOCOPY_AT_SPLIT;
+ /* tail pages are never under page migration */
+ SetPageCgroupUsed(pc);
}
/*
* Tail pages will be added to LRU.
* We hold lru_lock,then,reduce counter directly.
*/
lru = page_lru(head);
- mz = page_cgroup_zoneinfo(head_pc->mem_cgroup, head);
+ mz = page_cgroup_zoneinfo(head_memcg, head);
MEM_CGROUP_ZSTAT(mz, lru) -= HPAGE_PMD_NR - 1;
}
#endif
@@ -2545,7 +2547,7 @@ static int mem_cgroup_move_account(struct page *page,
lock_page_cgroup(pc);
ret = -EINVAL;
- if (!PageCgroupUsed(pc) || pc->mem_cgroup != from)
+ if (!PageCgroupUsed(pc) || pc_to_memcg(pc) != from)
goto unlock;
mem_cgroup_move_account_wlock(page, &flags);
@@ -2563,7 +2565,7 @@ static int mem_cgroup_move_account(struct page *page,
__mem_cgroup_cancel_charge(from, nr_pages);
/* caller should have done css_get */
- pc->mem_cgroup = to;
+ pc_set_memcg(pc, to);
mem_cgroup_charge_statistics(to, !PageAnon(page), nr_pages);
/*
* We charges against "to" which may not have any tasks. Then, "to"
@@ -2928,7 +2930,7 @@ __mem_cgroup_uncharge_common(struct page *page, enum charge_type ctype)
lock_page_cgroup(pc);
- memcg = pc->mem_cgroup;
+ memcg = pc_to_memcg(pc);
if (!PageCgroupUsed(pc))
goto unlock_out;
@@ -3109,7 +3111,7 @@ void mem_cgroup_reset_owner(struct page *newpage)
pc = lookup_page_cgroup(newpage);
VM_BUG_ON(PageCgroupUsed(pc));
- pc->mem_cgroup = root_mem_cgroup;
+ pc_set_memcg(pc, root_mem_cgroup);
}
/**
@@ -3191,7 +3193,7 @@ int mem_cgroup_prepare_migration(struct page *page,
pc = lookup_page_cgroup(page);
lock_page_cgroup(pc);
if (PageCgroupUsed(pc)) {
- memcg = pc->mem_cgroup;
+ memcg = pc_to_memcg(pc);
css_get(&memcg->css);
/*
* At migrating an anonymous page, its mapcount goes down
@@ -3329,7 +3331,7 @@ void mem_cgroup_replace_page_cache(struct page *oldpage,
pc = lookup_page_cgroup(oldpage);
/* fix accounting on old pages */
lock_page_cgroup(pc);
- memcg = pc->mem_cgroup;
+ memcg = pc_to_memcg(pc);
mem_cgroup_charge_statistics(memcg, !PageAnon(oldpage), -1);
ClearPageCgroupUsed(pc);
unlock_page_cgroup(pc);
@@ -3376,14 +3378,15 @@ void mem_cgroup_print_bad_page(struct page *page)
if (pc) {
int ret = -1;
char *path;
+ struct mem_cgroup *memcg = pc_to_memcg(pc);
printk(KERN_ALERT "pc:%p pc->flags:%lx pc->mem_cgroup:%p",
- pc, pc->flags, pc->mem_cgroup);
+ pc, pc->_flags, memcg);
path = kmalloc(PATH_MAX, GFP_KERNEL);
if (path) {
rcu_read_lock();
- ret = cgroup_path(pc->mem_cgroup->css.cgroup,
+ ret = cgroup_path(memcg->css.cgroup,
path, PATH_MAX);
rcu_read_unlock();
}
@@ -5247,7 +5250,7 @@ static int is_target_pte_for_mc(struct vm_area_struct *vma,
* mem_cgroup_move_account() checks the pc is valid or not under
* the lock.
*/
- if (PageCgroupUsed(pc) && pc->mem_cgroup == mc.from) {
+ if (PageCgroupUsed(pc) && pc_to_memcg(pc) == mc.from) {
ret = MC_TARGET_PAGE;
if (target)
target->page = page;
* Re: [RFC] [PATCH 1/5] memcg: simplify account moving check
From: Johannes Weiner @ 2011-12-15 10:04 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: linux-mm, akpm, Michal Hocko, Balbir Singh, Hugh Dickins,
Ying Han, nishimura
On Thu, Dec 15, 2011 at 03:05:22PM +0900, KAMEZAWA Hiroyuki wrote:
> >From 528f5f2667da17c26e40d271b24691412e1cbe81 Mon Sep 17 00:00:00 2001
> From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Date: Thu, 15 Dec 2011 11:41:18 +0900
> Subject: [PATCH 1/5] memcg: simplify account moving check
>
> Now, percpu variable MEM_CGROUP_ON_MOVE is used for indicating that
> a memcg is under move_account() and pc->mem_cgroup under it may be
> overwritten.
>
> But this value is almost read-only and not worth being percpu.
> Using atomic_t instead.
I like this, but I think you can go one further. The only place I see
where the per-cpu counter is actually read is to avoid taking the
lock, but if you make that counter an atomic anyway - why bother?
Couldn't you remove the counter completely and just take move_lock
unconditionally in the page stat updating?
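Schematically, the pattern I mean, as a user-space model (names are made up,
and the RCU usage and the re-check under the lock that the real code needs
are omitted):

#include <stdatomic.h>
#include <pthread.h>

static atomic_int moving_account;	/* >0 while move_account() is running */
static pthread_mutex_t move_lock = PTHREAD_MUTEX_INITIALIZER;
static long file_mapped_stat;		/* stands in for a per-cpu counter */

static void update_page_stat(int val)
{
	if (atomic_load(&moving_account) == 0) {
		/* fast path: no mover is active, the page's owner is stable */
		file_mapped_stat += val;
		return;
	}
	/* slow path: serialize against account moving */
	pthread_mutex_lock(&move_lock);
	file_mapped_stat += val;
	pthread_mutex_unlock(&move_lock);
}

Once moving_account is an atomic anyway, that fast path buys little, hence my
suggestion to just drop the counter and take move_lock unconditionally.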
* Re: [RFC] [PATCH 1/5] memcg: simplify account moving check
From: KAMEZAWA Hiroyuki @ 2011-12-15 10:17 UTC (permalink / raw)
To: Johannes Weiner
Cc: linux-mm, akpm, Michal Hocko, Balbir Singh, Hugh Dickins,
Ying Han, nishimura
On Thu, 15 Dec 2011 11:04:43 +0100
Johannes Weiner <hannes@cmpxchg.org> wrote:
> On Thu, Dec 15, 2011 at 03:05:22PM +0900, KAMEZAWA Hiroyuki wrote:
> > >From 528f5f2667da17c26e40d271b24691412e1cbe81 Mon Sep 17 00:00:00 2001
> > From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> > Date: Thu, 15 Dec 2011 11:41:18 +0900
> > Subject: [PATCH 1/5] memcg: simplify account moving check
> >
> > Now, percpu variable MEM_CGROUP_ON_MOVE is used for indicating that
> > a memcg is under move_account() and pc->mem_cgroup under it may be
> > overwritten.
> >
> > But this value is almost read-only and not worth being percpu.
> > Using atomic_t instead.
>
> I like this, but I think you can go one further. The only place I see
> where the per-cpu counter is actually read is to avoid taking the
> lock, but if you make that counter an atomic anyway - why bother?
>
> Couldn't you remove the counter completely and just take move_lock
> unconditionally in the page stat updating?
>
Hmm, I (and Greg Thelen) were warned by Peter Zijlstra to 'please _never_ add
atomic ops to this path'. That 'moving_account' condition check was there to
avoid atomic ops in this path (in most cases).
We'll need to gather enough performance data after implementing per-memcg
dirty ratio accounting. So, could you wait for a while?
Anyway, my patch '[PATCH 5/5] memcg: remove PCG_MOVE_LOCK' may not work
well enough without the moving_account check. I'll think about it more.
Thank you for the review.
-Kame
* Re: [RFC][PATCH 4/5] memcg: remove PCG_CACHE bit
From: Johannes Weiner @ 2011-12-15 10:24 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: linux-mm, akpm, Michal Hocko, Balbir Singh, Hugh Dickins,
Ying Han, nishimura
On Thu, Dec 15, 2011 at 03:08:22PM +0900, KAMEZAWA Hiroyuki wrote:
> >From 577b441ae259728d83a99baba11bf4925b4542d4 Mon Sep 17 00:00:00 2001
> From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Date: Thu, 15 Dec 2011 12:09:03 +0900
> Subject: [PATCH 4/5] memcg: remove PCG_CACHE bit.
>
> This bit can be replaced by PageAnon(page) check.
You need to make sure that only fully rmapped pages are
charged/uncharged, which is not the case yet. E.g. look at the
newpage_charge() in mm/memory.c::do_anonymous_page() - if the page
table has been updated by someone else, the uncharge_page() in the error
path will see !PageAnon(), because page->mapping is not set yet, and
will uncharge it as cache.
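Schematically, the ordering in do_anonymous_page() that bites here (simplified
from mm/memory.c, other error handling elided):

	mem_cgroup_newpage_charge(page, mm, GFP_KERNEL); /* charged, no rmap yet */
	...
	if (!pte_none(*page_table))	/* somebody else installed a pte */
		goto release;
	...
	page_add_new_anon_rmap(page, vma, address); /* only now PageAnon() is true */
	...
release:
	mem_cgroup_uncharge_page(page);	/* sees !PageAnon(), treated as cache */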
What I think is required is to break up the charging and committing
like we do for swap cache already:
	if (!mem_cgroup_try_charge())
		goto error;
	page_add_new_anon_rmap()
	mem_cgroup_commit()
This will also allow us to even get rid of passing around the charge
type everywhere...
* Re: [RFC][PATCH 4/5] memcg: remove PCG_CACHE bit
From: KAMEZAWA Hiroyuki @ 2011-12-15 10:36 UTC (permalink / raw)
To: Johannes Weiner
Cc: linux-mm, akpm, Michal Hocko, Balbir Singh, Hugh Dickins,
Ying Han, nishimura
On Thu, 15 Dec 2011 11:24:42 +0100
Johannes Weiner <hannes@cmpxchg.org> wrote:
> On Thu, Dec 15, 2011 at 03:08:22PM +0900, KAMEZAWA Hiroyuki wrote:
> > >From 577b441ae259728d83a99baba11bf4925b4542d4 Mon Sep 17 00:00:00 2001
> > From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> > Date: Thu, 15 Dec 2011 12:09:03 +0900
> > Subject: [PATCH 4/5] memcg: remove PCG_CACHE bit.
> >
> > This bit can be replaced by PageAnon(page) check.
>
> You need to make sure that only fully rmapped pages are
> charged/uncharged, which is not the case yet. E.g. look at the
> newpage_charge() in mm/memory.c::do_anonymous_page() - if the page
> table is updated by someone else, that uncharge_page() in the error
> path will see !PageAnon() because page->mapping is not set yet and
> uncharge it as cache.
>
Ah, nice catch. I missed the error path.
> What I think is required is to break up the charging and committing
> like we do for swap cache already:
>
> 	if (!mem_cgroup_try_charge())
> 		goto error;
> 	page_add_new_anon_rmap()
> 	mem_cgroup_commit()
>
> This will also allow us to even get rid of passing around the charge
> type everywhere...
>
Thank you. I'll look into it.
To be honest, I want to remove the 'rss' and 'cache' counters ;(
They don't have much meaning after the LRU was split.
-Kame
* Re: [RFC][PATCH 4/5] memcg: remove PCG_CACHE bit
From: KAMEZAWA Hiroyuki @ 2011-12-15 12:04 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: Johannes Weiner, linux-mm, akpm, Michal Hocko, Balbir Singh,
Hugh Dickins, Ying Han, nishimura
On Thu, 15 Dec 2011 19:36:31 +0900
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:
> On Thu, 15 Dec 2011 11:24:42 +0100
> Johannes Weiner <hannes@cmpxchg.org> wrote:
> > What I think is required is to break up the charging and committing
> > like we do for swap cache already:
> >
> > 	if (!mem_cgroup_try_charge())
> > 		goto error;
> > 	page_add_new_anon_rmap()
> > 	mem_cgroup_commit()
> >
> > This will also allow us to even get rid of passing around the charge
> > type everywhere...
> >
>
> Thank you. I'll look into it.
>
> To be honest, I want to remove the 'rss' and 'cache' counters ;(
> They don't have much meaning after the LRU was split.
>
I'll use this version for testing. This patch sits under a deep stack of
unmerged patches, anyway.
==
diff --git a/include/linux/page_cgroup.h b/include/linux/page_cgroup.h
index e4cb1bf..86967ed 100644
--- a/include/linux/page_cgroup.h
+++ b/include/linux/page_cgroup.h
@@ -4,7 +4,6 @@
enum {
/* flags for mem_cgroup */
PCG_LOCK, /* Lock for pc->mem_cgroup and following bits. */
- PCG_CACHE, /* charged as cache */
PCG_USED, /* this object is in use. */
PCG_MIGRATION, /* under page migration */
/* flags for mem_cgroup and file and I/O status */
@@ -63,11 +62,6 @@ static inline void ClearPageCgroup##uname(struct page_cgroup *pc) \
static inline int TestClearPageCgroup##uname(struct page_cgroup *pc) \
{ return test_and_clear_bit(PCG_##lname, &pc->flags); }
-/* Cache flag is set only once (at allocation) */
-TESTPCGFLAG(Cache, CACHE)
-CLEARPCGFLAG(Cache, CACHE)
-SETPCGFLAG(Cache, CACHE)
-
TESTPCGFLAG(Used, USED)
CLEARPCGFLAG(Used, USED)
SETPCGFLAG(Used, USED)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index fdcf454..89c76f1 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2368,6 +2368,8 @@ static void __mem_cgroup_commit_charge(struct mem_cgroup *memcg,
struct page_cgroup *pc,
enum charge_type ctype)
{
+ bool file = false;
+
lock_page_cgroup(pc);
if (unlikely(PageCgroupUsed(pc))) {
unlock_page_cgroup(pc);
@@ -2390,18 +2392,17 @@ static void __mem_cgroup_commit_charge(struct mem_cgroup *memcg,
switch (ctype) {
case MEM_CGROUP_CHARGE_TYPE_CACHE:
case MEM_CGROUP_CHARGE_TYPE_SHMEM:
- SetPageCgroupCache(pc);
+ file = true;
SetPageCgroupUsed(pc);
break;
case MEM_CGROUP_CHARGE_TYPE_MAPPED:
- ClearPageCgroupCache(pc);
SetPageCgroupUsed(pc);
break;
default:
break;
}
- mem_cgroup_charge_statistics(memcg, PageCgroupCache(pc), nr_pages);
+ mem_cgroup_charge_statistics(memcg, file, nr_pages);
unlock_page_cgroup(pc);
WARN_ON_ONCE(PageLRU(page));
/*
@@ -2474,6 +2475,7 @@ static int mem_cgroup_move_account(struct page *page,
bool uncharge)
{
unsigned long flags;
+ bool file = false;
int ret;
VM_BUG_ON(from == to);
@@ -2503,14 +2505,17 @@ static int mem_cgroup_move_account(struct page *page,
__this_cpu_inc(to->stat->count[MEM_CGROUP_STAT_FILE_MAPPED]);
preempt_enable();
}
- mem_cgroup_charge_statistics(from, PageCgroupCache(pc), -nr_pages);
+ /* Once PageAnon is set, it will not be cleared until freed. */
+ if (!PageAnon(page))
+ file = true;
+ mem_cgroup_charge_statistics(from, file, -nr_pages);
if (uncharge)
/* This is not "cancel", but cancel_charge does all we need. */
__mem_cgroup_cancel_charge(from, nr_pages);
/* caller should have done css_get */
pc->mem_cgroup = to;
- mem_cgroup_charge_statistics(to, PageCgroupCache(pc), nr_pages);
+ mem_cgroup_charge_statistics(to, file, nr_pages);
/*
* We charges against "to" which may not have any tasks. Then, "to"
* can be under rmdir(). But in current implementation, caller of
@@ -2854,6 +2859,7 @@ __mem_cgroup_uncharge_common(struct page *page, enum charge_type ctype)
struct mem_cgroup *memcg = NULL;
unsigned int nr_pages = 1;
struct page_cgroup *pc;
+ bool file = false;
if (mem_cgroup_disabled())
return NULL;
@@ -2880,6 +2886,10 @@ __mem_cgroup_uncharge_common(struct page *page, enum charge_type ctype)
goto unlock_out;
switch (ctype) {
+ case MEM_CGROUP_CHARGE_TYPE_CACHE:
+ case MEM_CGROUP_CHARGE_TYPE_SHMEM:
+ file = true;
+ break;
case MEM_CGROUP_CHARGE_TYPE_MAPPED:
case MEM_CGROUP_CHARGE_TYPE_DROP:
/* See mem_cgroup_prepare_migration() */
@@ -2897,7 +2907,7 @@ __mem_cgroup_uncharge_common(struct page *page, enum charge_type ctype)
break;
}
- mem_cgroup_charge_statistics(memcg, PageCgroupCache(pc), -nr_pages);
+ mem_cgroup_charge_statistics(memcg, file, -nr_pages);
ClearPageCgroupUsed(pc);
/*
@@ -2938,9 +2948,13 @@ void mem_cgroup_uncharge_page(struct page *page)
void mem_cgroup_uncharge_cache_page(struct page *page)
{
+ int ctype = MEM_CGROUP_CHARGE_TYPE_CACHE;
VM_BUG_ON(page_mapped(page));
VM_BUG_ON(page->mapping);
- __mem_cgroup_uncharge_common(page, MEM_CGROUP_CHARGE_TYPE_CACHE);
+
+ if (page_is_file_cache(page))
+ ctype = MEM_CGROUP_CHARGE_TYPE_SHMEM;
+ __mem_cgroup_uncharge_common(page, ctype);
}
/*
@@ -3276,7 +3290,7 @@ void mem_cgroup_replace_page_cache(struct page *oldpage,
/* fix accounting on old pages */
lock_page_cgroup(pc);
memcg = pc->mem_cgroup;
- mem_cgroup_charge_statistics(memcg, PageCgroupCache(pc), -1);
+ mem_cgroup_charge_statistics(memcg, true, -1);
ClearPageCgroupUsed(pc);
unlock_page_cgroup(pc);
* Re: [RFC][PATCH 4/5] memcg: remove PCG_CACHE bit
2011-12-15 12:04 ` KAMEZAWA Hiroyuki
@ 2011-12-15 14:54 ` Johannes Weiner
0 siblings, 0 replies; 13+ messages in thread
From: Johannes Weiner @ 2011-12-15 14:54 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: linux-mm, akpm, Michal Hocko, Balbir Singh, Hugh Dickins,
Ying Han, nishimura
On Thu, Dec 15, 2011 at 09:04:06PM +0900, KAMEZAWA Hiroyuki wrote:
> On Thu, 15 Dec 2011 19:36:31 +0900
> KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:
>
> > On Thu, 15 Dec 2011 11:24:42 +0100
> > Johannes Weiner <hannes@cmpxchg.org> wrote:
>
> > > What I think is required is to break up the charging and committing
> > > like we do for swap cache already:
> > >
> > > 	if (!mem_cgroup_try_charge())
> > > 		goto error;
> > > 	page_add_new_anon_rmap()
> > > 	mem_cgroup_commit()
> > >
> > > This will also allow us to even get rid of passing around the charge
> > > type everywhere...
> > >
> >
> > Thank you. I'll look into it.
> >
> > To be honest, I want to remove the 'rss' and 'cache' counters ;(
> > They don't have much meaning after the LRU was split.
> >
>
> I'll use this version for testing. This patch sits under a deep stack of
> unmerged patches, anyway.
Ok, makes sense. I can do the PCG_CACHE removal, btw; I have half the
patches sitting around anyway and just need to fix up huge_memory.c.
> @@ -2938,9 +2948,13 @@ void mem_cgroup_uncharge_page(struct page *page)
>
> void mem_cgroup_uncharge_cache_page(struct page *page)
> {
> + int ctype = MEM_CGROUP_CHARGE_TYPE_CACHE;
> VM_BUG_ON(page_mapped(page));
> VM_BUG_ON(page->mapping);
> - __mem_cgroup_uncharge_common(page, MEM_CGROUP_CHARGE_TYPE_CACHE);
> +
> + if (page_is_file_cache(page))
> + ctype = MEM_CGROUP_CHARGE_TYPE_SHMEM;
> + __mem_cgroup_uncharge_common(page, ctype);
I think this is missing a negation, but it doesn't matter, because the
SHMEM and CACHE charge types are treated exactly the same way. I'll
send a patch series that removes it soon; there are more shmem-related
things...
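With the negation applied, that hunk would presumably read:

	int ctype = MEM_CGROUP_CHARGE_TYPE_CACHE;

	if (!page_is_file_cache(page))	/* shmem pages are swap-backed */
		ctype = MEM_CGROUP_CHARGE_TYPE_SHMEM;
	__mem_cgroup_uncharge_common(page, ctype);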