* [PATCH 0/4] fix THP and memcg issues v2.
From: KAMEZAWA Hiroyuki @ 2011-01-18 2:06 UTC
To: akpm; +Cc: linux-kernel, linux-mm, hannes, nishimura, balbir, Greg Thelen
Currently, when THP is enabled, memcg's counters go wrong. Moreover, rmdir()
may never finish. I have fixed some races since v1.
This series fixes the obvious counter breakage. When you test, setting
CONFIG_TRANSPARENT_HUGEPAGE=y
CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS=y
is appreciated. Tests should be done as follows:
# mount -t cgroup none /cgroup/memory -o memory
# mkdir /cgroup/memory/A
# mkdir /cgroup/memory/A/B
# run some programs under B.
# echo 0 > /cgroup/memory/A/B/memory.force_empty
and check that B's memory.stat shows RSS/CACHE/LRU all at 0.
Moving tasks between groups while they are running is another good test;
a small load generator is sketched below.
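Something like the following (illustrative only, not part of this series;
assumes your libc headers define MADV_HUGEPAGE) can be run under B to fill
it with THP-backed anonymous memory:

#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define SZ (64UL << 20)         /* 64MB: room for many 2MB huge pages */

int main(void)
{
        char *p = mmap(NULL, SZ, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED) {
                perror("mmap");
                return 1;
        }
        madvise(p, SZ, MADV_HUGEPAGE);  /* hint THP even if not "always" */
        memset(p, 1, SZ);               /* fault everything in */
        printf("pid %d holds %luMB; check memory.stat, then force_empty\n",
               (int)getpid(), SZ >> 20);
        pause();                        /* stay charged until killed */
        return 0;
}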
I know there is another problem when a memory cgroup hits its limit and
reclaim is busy, but I will fix that in another patch.
Thanks,
-Kame
* [PATCH 1/4] modify accounting function for supporting THP better
From: KAMEZAWA Hiroyuki @ 2011-01-18 2:07 UTC
To: KAMEZAWA Hiroyuki
Cc: akpm, linux-kernel, linux-mm, hannes, nishimura, balbir, Greg Thelen
mem_cgroup_charge_statistics() was designed for charging a single page, but
now we have transparent hugepage. To fix problems (in the following patches)
the function needs to take the number of pages as an argument.
The new function takes the following arguments:
- whether the page is file-backed, rather than 'pc'
- the number of pages being accounted
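For illustration, a userspace model (not the kernel code) of what the new
interface buys us: one batched update per THP instead of 512 per-page
updates:

#include <stdbool.h>
#include <stdio.h>

static long stat_rss, pgpgin, pgpgout;

/* models the new signature: a 'file' flag plus a signed page count */
static void charge_statistics(bool file, int nr_pages)
{
        (void)file;             /* the CACHE vs RSS split is elided here */
        stat_rss += nr_pages;
        if (nr_pages > 0)       /* pagein of a big page is one event */
                pgpgin++;
        else
                pgpgout++;
}

int main(void)
{
        charge_statistics(false, 512);  /* charge a 2MB THP in one call */
        charge_statistics(false, -512); /* uncharge it in one call */
        printf("rss=%ld in=%ld out=%ld\n", stat_rss, pgpgin, pgpgout);
        return 0;
}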
Changelog:
- used conventional variable names for the arguments.
- removed an unnecessary variable.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
mm/memcontrol.c | 25 ++++++++++++-------------
1 file changed, 12 insertions(+), 13 deletions(-)
Index: mmotm-0107/mm/memcontrol.c
===================================================================
--- mmotm-0107.orig/mm/memcontrol.c
+++ mmotm-0107/mm/memcontrol.c
@@ -600,23 +600,22 @@ static void mem_cgroup_swap_statistics(s
}
static void mem_cgroup_charge_statistics(struct mem_cgroup *mem,
- struct page_cgroup *pc,
- bool charge)
+ bool file, int nr_pages)
{
- int val = (charge) ? 1 : -1;
-
preempt_disable();
- if (PageCgroupCache(pc))
- __this_cpu_add(mem->stat->count[MEM_CGROUP_STAT_CACHE], val);
+ if (file)
+ __this_cpu_add(mem->stat->count[MEM_CGROUP_STAT_CACHE], nr_pages);
else
- __this_cpu_add(mem->stat->count[MEM_CGROUP_STAT_RSS], val);
+ __this_cpu_add(mem->stat->count[MEM_CGROUP_STAT_RSS], nr_pages);
- if (charge)
+ /* pagein of a big page is an event. So, ignore page size */
+ if (nr_pages > 0)
__this_cpu_inc(mem->stat->count[MEM_CGROUP_STAT_PGPGIN_COUNT]);
else
__this_cpu_inc(mem->stat->count[MEM_CGROUP_STAT_PGPGOUT_COUNT]);
- __this_cpu_inc(mem->stat->count[MEM_CGROUP_EVENTS]);
+
+ __this_cpu_add(mem->stat->count[MEM_CGROUP_EVENTS], nr_pages);
preempt_enable();
}
@@ -2115,7 +2114,7 @@ static void ____mem_cgroup_commit_charge
break;
}
- mem_cgroup_charge_statistics(mem, pc, true);
+ mem_cgroup_charge_statistics(mem, PageCgroupCache(pc), 1);
}
static void __mem_cgroup_commit_charge(struct mem_cgroup *mem,
@@ -2186,14 +2185,14 @@ static void __mem_cgroup_move_account(st
__this_cpu_inc(to->stat->count[MEM_CGROUP_STAT_FILE_MAPPED]);
preempt_enable();
}
- mem_cgroup_charge_statistics(from, pc, false);
+ mem_cgroup_charge_statistics(from, PageCgroupCache(pc), -1);
if (uncharge)
/* This is not "cancel", but cancel_charge does all we need. */
mem_cgroup_cancel_charge(from, PAGE_SIZE);
/* caller should have done css_get */
pc->mem_cgroup = to;
- mem_cgroup_charge_statistics(to, pc, true);
+ mem_cgroup_charge_statistics(to, PageCgroupCache(pc), 1);
/*
* We charges against "to" which may not have any tasks. Then, "to"
* can be under rmdir(). But in current implementation, caller of
@@ -2597,7 +2596,7 @@ __mem_cgroup_uncharge_common(struct page
}
for (i = 0; i < count; i++)
- mem_cgroup_charge_statistics(mem, pc + i, false);
+ mem_cgroup_charge_statistics(mem, PageCgroupCache(pc), -1);
ClearPageCgroupUsed(pc);
/*
* [PATCH 2/4] memcg: fix USED bit handling at uncharge with THP
From: KAMEZAWA Hiroyuki @ 2011-01-18 2:10 UTC
To: KAMEZAWA Hiroyuki
Cc: akpm, linux-kernel, linux-mm, hannes, nishimura, balbir, Greg Thelen
From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Currently, under THP:
at charge:
- the PageCgroupUsed bit is set on every page_cgroup of a hugepage,
  i.e. on all 512 pages.
at uncharge:
- the PageCgroupUsed bit is unset only on the head page.
So some pages remain with the "Used" bit set.
This patch changes charging to set the Used bit only on the head page.
Used bits for tail pages will be set at splitting, if necessary.
This patch introduces the following lock order:
zone->lru_lock -> compound_lock() -> page_cgroup_move_lock()
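To picture the leak, here is a userspace model (condensed from the
description above, not kernel code) of the old charge/uncharge asymmetry:

#include <stdbool.h>
#include <stdio.h>

#define HPAGE_NR 512    /* a 2MB hugepage spans 512 page_cgroups */

int main(void)
{
        bool used[HPAGE_NR];
        int i, leaked = 0;

        for (i = 0; i < HPAGE_NR; i++)  /* old charge: mark every pc Used */
                used[i] = true;
        used[0] = false;                /* old uncharge: clear head only */
        for (i = 0; i < HPAGE_NR; i++)
                leaked += used[i];
        printf("%d tail page_cgroups left with Used set\n", leaked); /* 511 */
        return 0;
}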
Changelog:
- fixed some style issues.
- removed a BUG_ON().
- fixed races between mem_cgroup_update_page_stat() and splitting;
PCG_MOVE_LOCK is used to avoid them.
- copied all required flags at splitting.
- added a dummy function so builds without memcg still compile.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
include/linux/memcontrol.h | 8 +++
mm/huge_memory.c | 2
mm/memcontrol.c | 91 +++++++++++++++++++++++++--------------------
3 files changed, 61 insertions(+), 40 deletions(-)
Index: mmotm-0107/mm/memcontrol.c
===================================================================
--- mmotm-0107.orig/mm/memcontrol.c
+++ mmotm-0107/mm/memcontrol.c
@@ -1614,7 +1614,7 @@ void mem_cgroup_update_page_stat(struct
if (unlikely(!mem || !PageCgroupUsed(pc)))
goto out;
/* pc->mem_cgroup is unstable ? */
- if (unlikely(mem_cgroup_stealed(mem))) {
+ if (unlikely(mem_cgroup_stealed(mem)) || PageTransHuge(page)) {
/* take a lock against to access pc->mem_cgroup */
move_lock_page_cgroup(pc, &flags);
need_unlock = true;
@@ -2083,14 +2083,27 @@ struct mem_cgroup *try_get_mem_cgroup_fr
return mem;
}
-/*
- * commit a charge got by __mem_cgroup_try_charge() and makes page_cgroup to be
- * USED state. If already USED, uncharge and return.
- */
-static void ____mem_cgroup_commit_charge(struct mem_cgroup *mem,
- struct page_cgroup *pc,
- enum charge_type ctype)
+static void __mem_cgroup_commit_charge(struct mem_cgroup *mem,
+ struct page_cgroup *pc,
+ enum charge_type ctype,
+ int page_size)
{
+ int nr_pages = page_size >> PAGE_SHIFT;
+
+ /* try_charge() can return NULL to *memcg, taking care of it. */
+ if (!mem)
+ return;
+
+ lock_page_cgroup(pc);
+ if (unlikely(PageCgroupUsed(pc))) {
+ unlock_page_cgroup(pc);
+ mem_cgroup_cancel_charge(mem, page_size);
+ return;
+ }
+ /*
+ * we don't need page_cgroup_lock about tail pages, becase they are not
+ * accessed by any other context at this point.
+ */
pc->mem_cgroup = mem;
/*
* We access a page_cgroup asynchronously without lock_page_cgroup().
@@ -2114,35 +2127,7 @@ static void ____mem_cgroup_commit_charge
break;
}
- mem_cgroup_charge_statistics(mem, PageCgroupCache(pc), 1);
-}
-
-static void __mem_cgroup_commit_charge(struct mem_cgroup *mem,
- struct page_cgroup *pc,
- enum charge_type ctype,
- int page_size)
-{
- int i;
- int count = page_size >> PAGE_SHIFT;
-
- /* try_charge() can return NULL to *memcg, taking care of it. */
- if (!mem)
- return;
-
- lock_page_cgroup(pc);
- if (unlikely(PageCgroupUsed(pc))) {
- unlock_page_cgroup(pc);
- mem_cgroup_cancel_charge(mem, page_size);
- return;
- }
-
- /*
- * we don't need page_cgroup_lock about tail pages, becase they are not
- * accessed by any other context at this point.
- */
- for (i = 0; i < count; i++)
- ____mem_cgroup_commit_charge(mem, pc + i, ctype);
-
+ mem_cgroup_charge_statistics(mem, PageCgroupCache(pc), nr_pages);
unlock_page_cgroup(pc);
/*
* "charge_statistics" updated event counter. Then, check it.
@@ -2152,6 +2137,34 @@ static void __mem_cgroup_commit_charge(s
memcg_check_events(mem, pc->page);
}
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+
+#define PCGF_NOCOPY_AT_SPLIT ((1 << PCG_LOCK) | (1 << PCG_MOVE_LOCK) |\
+ (1 << PCG_MIGRATION))
+/*
+ * Because tail pages are not marked as "used", set it. We're under
+ * 'splitting' and compund_lock. 'splitting' ensures that the pages
+ */
+void mem_cgroup_split_huge_fixup(struct page *head, struct page *tail)
+{
+ struct page_cgroup *head_pc = lookup_page_cgroup(head);
+ struct page_cgroup *tail_pc = lookup_page_cgroup(tail);
+ unsigned long flags;
+
+ /*
+ * We have no races witch charge/uncharge but will have races with
+ * page state accounting.
+ */
+ move_lock_page_cgroup(head_pc, &flags);
+
+ tail_pc->mem_cgroup = head_pc->mem_cgroup;
+ smp_wmb(); /* see __commit_charge() */
+ /* we don't need to copy all flags...*/
+ tail_pc->flags = head_pc->flags & ~PCGF_NOCOPY_AT_SPLIT;
+ move_unlock_page_cgroup(head_pc, &flags);
+}
+#endif
+
/**
* __mem_cgroup_move_account - move account of the page
* @pc: page_cgroup of the page.
@@ -2545,7 +2558,6 @@ direct_uncharge:
static struct mem_cgroup *
__mem_cgroup_uncharge_common(struct page *page, enum charge_type ctype)
{
- int i;
int count;
struct page_cgroup *pc;
struct mem_cgroup *mem = NULL;
@@ -2595,8 +2607,7 @@ __mem_cgroup_uncharge_common(struct page
break;
}
- for (i = 0; i < count; i++)
- mem_cgroup_charge_statistics(mem, PageCgroupCache(pc), -1);
+ mem_cgroup_charge_statistics(mem, PageCgroupCache(pc), -count);
ClearPageCgroupUsed(pc);
/*
Index: mmotm-0107/include/linux/memcontrol.h
===================================================================
--- mmotm-0107.orig/include/linux/memcontrol.h
+++ mmotm-0107/include/linux/memcontrol.h
@@ -146,6 +146,10 @@ unsigned long mem_cgroup_soft_limit_recl
gfp_t gfp_mask);
u64 mem_cgroup_get_limit(struct mem_cgroup *mem);
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+void mem_cgroup_split_huge_fixup(struct page *head, struct page *tail);
+#endif
+
#else /* CONFIG_CGROUP_MEM_RES_CTLR */
struct mem_cgroup;
@@ -335,6 +339,10 @@ u64 mem_cgroup_get_limit(struct mem_cgro
return 0;
}
+static inline mem_cgroup_split_huge_fixup(struct page *head, struct page *tail)
+{
+}
+
#endif /* CONFIG_CGROUP_MEM_CONT */
#endif /* _LINUX_MEMCONTROL_H */
Index: mmotm-0107/mm/huge_memory.c
===================================================================
--- mmotm-0107.orig/mm/huge_memory.c
+++ mmotm-0107/mm/huge_memory.c
@@ -1203,6 +1203,8 @@ static void __split_huge_page_refcount(s
BUG_ON(!PageDirty(page_tail));
BUG_ON(!PageSwapBacked(page_tail));
+ mem_cgroup_split_huge_fixup(page, page_tail);
+
lru_add_page_tail(zone, page, page_tail);
}
* [PATCH 3/4] memcg: fix LRU accounting with THP
From: KAMEZAWA Hiroyuki @ 2011-01-18 2:14 UTC
To: KAMEZAWA Hiroyuki
Cc: akpm, linux-kernel, linux-mm, hannes, nishimura, balbir, Greg Thelen
From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
The memory cgroup's LRU statistics must take the page size into account,
because Transparent Hugepage inserts hugepages into the LRU.
If this value is wrong, memory reclaim will not work well.
Note: only the head page of a THP huge page is linked into the LRU.
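The accounting rule, as a userspace sketch (compound_order(page) is 9 for a
2MB THP and 0 for a base page; the names here are illustrative):

#include <stdio.h>

static long zstat_lru;  /* stands in for MEM_CGROUP_ZSTAT(mz, lru) */

static void lru_add(unsigned int order)
{
        zstat_lru += 1L << order;       /* 512 pages for one THP head */
}

static void lru_del(unsigned int order)
{
        zstat_lru -= 1L << order;
}

int main(void)
{
        lru_add(9);     /* a THP head page enters the LRU */
        lru_del(9);     /* and leaves it again */
        printf("zstat_lru=%ld\n", zstat_lru);   /* balances at 0 */
        return 0;
}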
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
mm/memcontrol.c | 26 ++++++++++++++++++++------
1 file changed, 20 insertions(+), 6 deletions(-)
Index: mmotm-0107/mm/memcontrol.c
===================================================================
--- mmotm-0107.orig/mm/memcontrol.c
+++ mmotm-0107/mm/memcontrol.c
@@ -814,7 +814,8 @@ void mem_cgroup_del_lru_list(struct page
* removed from global LRU.
*/
mz = page_cgroup_zoneinfo(pc);
- MEM_CGROUP_ZSTAT(mz, lru) -= 1;
+ /* huge page split is done under lru_lock. so, we have no races. */
+ MEM_CGROUP_ZSTAT(mz, lru) -= 1 << compound_order(page);
if (mem_cgroup_is_root(pc->mem_cgroup))
return;
VM_BUG_ON(list_empty(&pc->lru));
@@ -865,7 +866,8 @@ void mem_cgroup_add_lru_list(struct page
return;
mz = page_cgroup_zoneinfo(pc);
- MEM_CGROUP_ZSTAT(mz, lru) += 1;
+ /* huge page split is done under lru_lock. so, we have no races. */
+ MEM_CGROUP_ZSTAT(mz, lru) += 1 << compound_order(page);
SetPageCgroupAcctLRU(pc);
if (mem_cgroup_is_root(pc->mem_cgroup))
return;
@@ -2140,10 +2142,10 @@ static void __mem_cgroup_commit_charge(s
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
#define PCGF_NOCOPY_AT_SPLIT ((1 << PCG_LOCK) | (1 << PCG_MOVE_LOCK) |\
- (1 << PCG_MIGRATION))
+ (1 << PCG_ACCT_LRU) | (1 << PCG_MIGRATION))
/*
* Because tail pages are not marked as "used", set it. We're under
- * 'splitting' and compund_lock. 'splitting' ensures that the pages
+ * 'splitting' and compund_lock and zone->lru_lock.
*/
void mem_cgroup_split_huge_fixup(struct page *head, struct page *tail)
{
@@ -2152,14 +2154,26 @@ void mem_cgroup_split_huge_fixup(struct
unsigned long flags;
/*
- * We have no races witch charge/uncharge but will have races with
+ * We have no races with charge/uncharge but will have races with
* page state accounting.
*/
move_lock_page_cgroup(head_pc, &flags);
tail_pc->mem_cgroup = head_pc->mem_cgroup;
smp_wmb(); /* see __commit_charge() */
- /* we don't need to copy all flags...*/
+ if (PageCgroupAcctLRU(head_pc)) {
+ enum lru_list lru;
+ struct mem_cgroup_per_zone *mz;
+
+ /*
+ * LRU flags cannot be copied because we need to add tail
+ * page to LRU by generic call and our hook will be called.
+ * We hold lru_lock, then, reduce counter directly.
+ */
+ lru = page_lru(head);
+ mz = page_cgroup_zoneinfo(head_pc);
+ MEM_CGROUP_ZSTAT(mz, lru) -= 1;
+ }
tail_pc->flags = head_pc->flags & ~PCGF_NOCOPY_AT_SPLIT;
move_unlock_page_cgroup(head_pc, &flags);
}
* [PATCH 4/4] memcg: fix rmdir, force_empty with THP
From: KAMEZAWA Hiroyuki @ 2011-01-18 2:15 UTC
To: KAMEZAWA Hiroyuki
Cc: akpm, linux-kernel, linux-mm, hannes, nishimura, balbir, Greg Thelen
Currently, when THP is enabled, memcg's rmdir() is broken because
move_account() does not support THP pages.
This causes accounting leaks or -EBUSY failures at rmdir().
This patch fixes the issue by supporting move_account() for THP pages.
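In outline (a userspace model of the fixed flow; the struct and names are
assumptions for illustration, not the real kernel API): compute the charge
from the compound order and hold the compound lock across the move so the
page cannot be split underneath us:

#include <stdbool.h>
#include <stdio.h>

#define PAGE_SIZE 4096L

struct page { unsigned int compound_order; bool compound_locked; };

static void move_to_parent(struct page *page)
{
        long charge = PAGE_SIZE << page->compound_order;

        if (charge > PAGE_SIZE)
                page->compound_locked = true;   /* block concurrent split */

        /* ... try_charge(parent) and move_account(charge) run here ... */

        if (charge > PAGE_SIZE)
                page->compound_locked = false;
        printf("moved %ld bytes to the parent group\n", charge);
}

int main(void)
{
        struct page thp = { .compound_order = 9, .compound_locked = false };

        move_to_parent(&thp);
        return 0;
}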
Changelog:
- style fixes.
- added compound_lock to avoid races.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
mm/memcontrol.c | 37 ++++++++++++++++++++++++++-----------
1 file changed, 26 insertions(+), 11 deletions(-)
Index: mmotm-0107/mm/memcontrol.c
===================================================================
--- mmotm-0107.orig/mm/memcontrol.c
+++ mmotm-0107/mm/memcontrol.c
@@ -2197,8 +2197,11 @@ void mem_cgroup_split_huge_fixup(struct
*/
static void __mem_cgroup_move_account(struct page_cgroup *pc,
- struct mem_cgroup *from, struct mem_cgroup *to, bool uncharge)
+ struct mem_cgroup *from, struct mem_cgroup *to, bool uncharge,
+ int charge_size)
{
+ int nr_pages = charge_size >> PAGE_SHIFT;
+
VM_BUG_ON(from == to);
VM_BUG_ON(PageLRU(pc->page));
VM_BUG_ON(!page_is_cgroup_locked(pc));
@@ -2212,14 +2215,14 @@ static void __mem_cgroup_move_account(st
__this_cpu_inc(to->stat->count[MEM_CGROUP_STAT_FILE_MAPPED]);
preempt_enable();
}
- mem_cgroup_charge_statistics(from, PageCgroupCache(pc), -1);
+ mem_cgroup_charge_statistics(from, PageCgroupCache(pc), -nr_pages);
if (uncharge)
/* This is not "cancel", but cancel_charge does all we need. */
- mem_cgroup_cancel_charge(from, PAGE_SIZE);
+ mem_cgroup_cancel_charge(from, charge_size);
/* caller should have done css_get */
pc->mem_cgroup = to;
- mem_cgroup_charge_statistics(to, PageCgroupCache(pc), 1);
+ mem_cgroup_charge_statistics(to, PageCgroupCache(pc), nr_pages);
/*
* We charges against "to" which may not have any tasks. Then, "to"
* can be under rmdir(). But in current implementation, caller of
@@ -2234,15 +2237,19 @@ static void __mem_cgroup_move_account(st
* __mem_cgroup_move_account()
*/
static int mem_cgroup_move_account(struct page_cgroup *pc,
- struct mem_cgroup *from, struct mem_cgroup *to, bool uncharge)
+ struct mem_cgroup *from, struct mem_cgroup *to,
+ bool uncharge, int charge_size)
{
int ret = -EINVAL;
unsigned long flags;
+ if ((charge_size > PAGE_SIZE) && !PageTransHuge(pc->page))
+ return -EBUSY;
+
lock_page_cgroup(pc);
if (PageCgroupUsed(pc) && pc->mem_cgroup == from) {
move_lock_page_cgroup(pc, &flags);
- __mem_cgroup_move_account(pc, from, to, uncharge);
+ __mem_cgroup_move_account(pc, from, to, uncharge, charge_size);
move_unlock_page_cgroup(pc, &flags);
ret = 0;
}
@@ -2267,6 +2274,8 @@ static int mem_cgroup_move_parent(struct
struct cgroup *cg = child->css.cgroup;
struct cgroup *pcg = cg->parent;
struct mem_cgroup *parent;
+ int charge = PAGE_SIZE;
+ unsigned long flags;
int ret;
/* Is ROOT ? */
@@ -2278,17 +2287,23 @@ static int mem_cgroup_move_parent(struct
goto out;
if (isolate_lru_page(page))
goto put;
+ /* The page is isolated from LRU and we have no race with splitting */
+ charge = PAGE_SIZE << compound_order(page);
parent = mem_cgroup_from_cont(pcg);
- ret = __mem_cgroup_try_charge(NULL, gfp_mask, &parent, false,
- PAGE_SIZE);
+ ret = __mem_cgroup_try_charge(NULL, gfp_mask, &parent, false, charge);
if (ret || !parent)
goto put_back;
- ret = mem_cgroup_move_account(pc, child, parent, true);
+ if (charge > PAGE_SIZE)
+ flags = compound_lock_irqsave(page);
+
+ ret = mem_cgroup_move_account(pc, child, parent, true, charge);
if (ret)
- mem_cgroup_cancel_charge(parent, PAGE_SIZE);
+ mem_cgroup_cancel_charge(parent, charge);
put_back:
+ if (charge > PAGE_SIZE)
+ compound_unlock_irqrestore(page, flags);
putback_lru_page(page);
put:
put_page(page);
@@ -4868,7 +4883,7 @@ retry:
goto put;
pc = lookup_page_cgroup(page);
if (!mem_cgroup_move_account(pc,
- mc.from, mc.to, false)) {
+ mc.from, mc.to, false, PAGE_SIZE)) {
mc.precharge--;
/* we uncharge from mc.from later. */
mc.moved_charge++;
* Re: [PATCH 0/4] fix THP and memcg issues v2.
From: KAMEZAWA Hiroyuki @ 2011-01-18 2:31 UTC
To: KAMEZAWA Hiroyuki
Cc: akpm, linux-kernel, linux-mm, hannes, nishimura, balbir, Greg Thelen
On Tue, 18 Jan 2011 11:06:04 +0900
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:
> Currently, when THP is enabled, memcg's counters go wrong. Moreover, rmdir()
> may never finish. I have fixed some races since v1.
(snip)
I found some mistakes in patch handling in 2/4 and 3/4
(some code meant for 2/4 ended up in 3/4).
I'll send v3. I'm sorry.
Thanks,
-Kame