* [PATCH 0/6 v4] memcg: page cgroup diet
@ 2012-02-14 3:04 KAMEZAWA Hiroyuki
2012-02-14 3:06 ` [PATCH 1/6 v4] memcg: remove EXPORT_SYMBOL(mem_cgroup_update_page_stat) KAMEZAWA Hiroyuki
` (5 more replies)
0 siblings, 6 replies; 16+ messages in thread
From: KAMEZAWA Hiroyuki @ 2012-02-14 3:04 UTC (permalink / raw)
To: linux-mm; +Cc: linux-kernel, hannes, Michal Hocko, akpm, Ying Han, Hugh Dickins
Here is v4. I dropped the RFC tag and fixed a fatal bug that was in v3.
This patch series is a preparation for removing page_cgroup->flags. To remove
it, we need to reduce the number of flags on page_cgroup. In this set,
PCG_MOVE_LOCK and PCG_FILE_MAPPED are removed.
After this, only 3 bits remain.
==
enum {
	/* flags for mem_cgroup */
	PCG_LOCK,	/* Lock for pc->mem_cgroup and following bits. */
	PCG_USED,	/* this object is in use. */
	PCG_MIGRATION,	/* under page migration */
	__NR_PCG_FLAGS,
};
==
Next, I'll try to merge these 3 bits into the lower 3 bits of the pointer to
memcg. Then page_cgroup can shrink to 8 (4) bytes per page.
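(Illustration only, not part of this series: a userspace sketch of how flag
bits can live in the low bits of an aligned pointer. All names below are made
up for the example.)
==
#include <assert.h>
#include <stdio.h>

#define PCG_LOCK	0x1UL
#define PCG_USED	0x2UL
#define PCG_MIGRATION	0x4UL
#define PCG_FLAG_MASK	0x7UL	/* requires the memcg object to be 8-byte aligned */

struct page_cgroup {
	unsigned long val;	/* pointer to memcg | flag bits */
};

static void *pc_memcg(struct page_cgroup *pc)
{
	return (void *)(pc->val & ~PCG_FLAG_MASK);
}

static void pc_set_flag(struct page_cgroup *pc, unsigned long flag)
{
	pc->val |= flag;
}

static int pc_test_flag(struct page_cgroup *pc, unsigned long flag)
{
	return !!(pc->val & flag);
}

int main(void)
{
	static _Alignas(8) unsigned long memcg_obj;	/* stand-in for a memcg */
	struct page_cgroup pc = { .val = (unsigned long)&memcg_obj };

	pc_set_flag(&pc, PCG_USED);
	/* the pointer survives flag updates */
	assert(pc_memcg(&pc) == (void *)&memcg_obj);
	printf("used=%d migration=%d\n",
	       pc_test_flag(&pc, PCG_USED), pc_test_flag(&pc, PCG_MIGRATION));
	return 0;
}
==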
Major changes since v3:
 - replaced move_lock_page_cgroup() with move_lock_mem_cgroup(), which
   takes a pointer to the memcg rather than to the page_cgroup.
This set is based on linux-next of Feb 13 and has passed several tests.
Thanks,
-Kame
* [PATCH 1/6 v4] memcg: remove EXPORT_SYMBOL(mem_cgroup_update_page_stat)
2012-02-14 3:04 [PATCH 0/6 v4] memcg: page cgroup diet KAMEZAWA Hiroyuki
@ 2012-02-14 3:06 ` KAMEZAWA Hiroyuki
2012-02-14 7:21 ` Greg Thelen
2012-02-14 3:07 ` [PATCH 2/6 v4] memcg: simplify move_account() check KAMEZAWA Hiroyuki
` (4 subsequent siblings)
5 siblings, 1 reply; 16+ messages in thread
From: KAMEZAWA Hiroyuki @ 2012-02-14 3:06 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: linux-mm, linux-kernel, hannes, Michal Hocko, akpm, Ying Han,
Hugh Dickins
This is just a cleanup.
==
From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Date: Thu, 2 Feb 2012 12:05:41 +0900
Subject: [PATCH 1/6] memcg: remove EXPORT_SYMBOL(mem_cgroup_update_page_stat)
From the log, I guess EXPORT was for preparing dirty accounting.
But _now_, we don't need to export this. Remove this for now.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
mm/memcontrol.c | 1 -
1 files changed, 0 insertions(+), 1 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index ab315ab..4c2b759 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1897,7 +1897,6 @@ out:
 	move_unlock_page_cgroup(pc, &flags);
 	rcu_read_unlock();
 }
-EXPORT_SYMBOL(mem_cgroup_update_page_stat);
 
 /*
  * size of first charge trial. "32" comes from vmscan.c's magic value.
--
1.7.4.1
* [PATCH 2/6 v4] memcg: simplify move_account() check
2012-02-14 3:04 [PATCH 0/6 v4] memcg: page cgroup diet KAMEZAWA Hiroyuki
2012-02-14 3:06 ` [PATCH 1/6 v4] memcg: remove EXPORT_SYMBOL(mem_cgroup_update_page_stat) KAMEZAWA Hiroyuki
@ 2012-02-14 3:07 ` KAMEZAWA Hiroyuki
2012-02-14 7:21 ` Greg Thelen
2012-02-14 3:13 ` [PATCH 3/6 v4] memcg: remove PCG_MOVE_LOCK flag from page_cgroup KAMEZAWA Hiroyuki
` (3 subsequent siblings)
5 siblings, 1 reply; 16+ messages in thread
From: KAMEZAWA Hiroyuki @ 2012-02-14 3:07 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: linux-mm, linux-kernel, hannes, Michal Hocko, akpm, Ying Han,
Hugh Dickins
* [PATCH 3/6 v4] memcg: remove PCG_MOVE_LOCK flag from page_cgroup.
2012-02-14 3:04 [PATCH 0/6 v4] memcg: page cgroup diet KAMEZAWA Hiroyuki
2012-02-14 3:06 ` [PATCH 1/6 v4] memcg: remove EXPORT_SYMBOL(mem_cgroup_update_page_stat) KAMEZAWA Hiroyuki
2012-02-14 3:07 ` [PATCH 2/6 v4] memcg: simplify move_account() check KAMEZAWA Hiroyuki
@ 2012-02-14 3:13 ` KAMEZAWA Hiroyuki
2012-02-14 7:21 ` Greg Thelen
2012-02-14 3:14 ` [PATCH 4/6 v4] memcg: use new logic for page stat accounting KAMEZAWA Hiroyuki
` (2 subsequent siblings)
5 siblings, 1 reply; 16+ messages in thread
From: KAMEZAWA Hiroyuki @ 2012-02-14 3:13 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: linux-mm, linux-kernel, hannes, Michal Hocko, akpm, Ying Han,
Hugh Dickins
* [PATCH 4/6 v4] memcg: use new logic for page stat accounting
2012-02-14 3:04 [PATCH 0/6 v4] memcg: page cgroup diet KAMEZAWA Hiroyuki
` (2 preceding siblings ...)
2012-02-14 3:13 ` [PATCH 3/6 v4] memcg: remove PCG_MOVE_LOCK flag from page_cgroup KAMEZAWA Hiroyuki
@ 2012-02-14 3:14 ` KAMEZAWA Hiroyuki
2012-02-14 7:22 ` Greg Thelen
2012-02-14 3:15 ` [PATCH 5/6 v4] memcg: remove PCG_FILE_MAPPED KAMEZAWA Hiroyuki
2012-02-14 3:16 ` [PATCH 6/6 v4] memcg: fix performance of mem_cgroup_begin_update_page_stat() KAMEZAWA Hiroyuki
5 siblings, 1 reply; 16+ messages in thread
From: KAMEZAWA Hiroyuki @ 2012-02-14 3:14 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: linux-mm, linux-kernel, hannes, Michal Hocko, akpm, Ying Han,
Hugh Dickins
* [PATCH 5/6 v4] memcg: remove PCG_FILE_MAPPED
2012-02-14 3:04 [PATCH 0/6 v4] memcg: page cgroup diet KAMEZAWA Hiroyuki
` (3 preceding siblings ...)
2012-02-14 3:14 ` [PATCH 4/6 v4] memcg: use new logic for page stat accounting KAMEZAWA Hiroyuki
@ 2012-02-14 3:15 ` KAMEZAWA Hiroyuki
2012-02-14 7:22 ` Greg Thelen
2012-02-14 3:16 ` [PATCH 6/6 v4] memcg: fix performance of mem_cgroup_begin_update_page_stat() KAMEZAWA Hiroyuki
5 siblings, 1 reply; 16+ messages in thread
From: KAMEZAWA Hiroyuki @ 2012-02-14 3:15 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: linux-mm, linux-kernel, hannes, Michal Hocko, akpm, Ying Han,
Hugh Dickins
* [PATCH 6/6 v4] memcg: fix performance of mem_cgroup_begin_update_page_stat()
2012-02-14 3:04 [PATCH 0/6 v4] memcg: page cgroup diet KAMEZAWA Hiroyuki
` (4 preceding siblings ...)
2012-02-14 3:15 ` [PATCH 5/6 v4] memcg: remove PCG_FILE_MAPPED KAMEZAWA Hiroyuki
@ 2012-02-14 3:16 ` KAMEZAWA Hiroyuki
2012-02-14 7:22 ` Greg Thelen
5 siblings, 1 reply; 16+ messages in thread
From: KAMEZAWA Hiroyuki @ 2012-02-14 3:16 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: linux-mm, linux-kernel, hannes, Michal Hocko, akpm, Ying Han,
Hugh Dickins
* Re: [PATCH 1/6 v4] memcg: remove EXPORT_SYMBOL(mem_cgroup_update_page_stat)
2012-02-14 3:06 ` [PATCH 1/6 v4] memcg: remove EXPORT_SYMBOL(mem_cgroup_update_page_stat) KAMEZAWA Hiroyuki
@ 2012-02-14 7:21 ` Greg Thelen
0 siblings, 0 replies; 16+ messages in thread
From: Greg Thelen @ 2012-02-14 7:21 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: linux-mm, linux-kernel, hannes, Michal Hocko, akpm, Ying Han,
Hugh Dickins
On Mon, Feb 13, 2012 at 7:06 PM, KAMEZAWA Hiroyuki
<kamezawa.hiroyu@jp.fujitsu.com> wrote:
>
> This is just a cleanup.
> ==
> From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Date: Thu, 2 Feb 2012 12:05:41 +0900
> Subject: [PATCH 1/6] memcg: remove EXPORT_SYMBOL(mem_cgroup_update_page_stat)
>
> From the log, I guess EXPORT was for preparing dirty accounting.
> But _now_, we don't need to export this. Remove this for now.
>
> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Looks good to me.
Reviewed-by: Greg Thelen <gthelen@google.com>
> ---
> mm/memcontrol.c | 1 -
> 1 files changed, 0 insertions(+), 1 deletions(-)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index ab315ab..4c2b759 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -1897,7 +1897,6 @@ out:
>  	move_unlock_page_cgroup(pc, &flags);
>  	rcu_read_unlock();
>  }
> -EXPORT_SYMBOL(mem_cgroup_update_page_stat);
>
>  /*
>   * size of first charge trial. "32" comes from vmscan.c's magic value.
> --
> 1.7.4.1
* Re: [PATCH 2/6 v4] memcg: simplify move_account() check
2012-02-14 3:07 ` [PATCH 2/6 v4] memcg: simplify move_account() check KAMEZAWA Hiroyuki
@ 2012-02-14 7:21 ` Greg Thelen
2012-02-14 8:44 ` KAMEZAWA Hiroyuki
0 siblings, 1 reply; 16+ messages in thread
From: Greg Thelen @ 2012-02-14 7:21 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: linux-mm, linux-kernel, hannes, Michal Hocko, akpm, Ying Han,
Hugh Dickins
On Mon, Feb 13, 2012 at 7:07 PM, KAMEZAWA Hiroyuki
<kamezawa.hiroyu@jp.fujitsu.com> wrote:
> From 9cdb3b63dc8d08cc2220c54c80438c13433a0d12 Mon Sep 17 00:00:00 2001
> From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Date: Thu, 2 Feb 2012 10:02:39 +0900
> Subject: [PATCH 2/6] memcg: simplify move_account() check.
>
> In memcg, to avoid taking a lock with irqs disabled when accessing page_cgroup,
> a flag + rcu_read_lock() scheme is used. It works as follows:
>
>    CPU-A                       CPU-B
>                                rcu_read_lock()
>    set flag
>                                if(flag is set)
>                                      take heavy lock
>                                do job.
>    synchronize_rcu()           rcu_read_unlock()
I assume that CPU-A will take heavy lock after synchronize_rcu() when
updating variables read by CPU-B.
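(A schematic of the flag + rcu_read_lock() pattern described above, for
readers skimming the thread. The names are illustrative only -- this is not
the actual memcg code, where the flag is per-memcg and the heavy lock is the
move lock -- and memory-ordering details are elided.)

#include <linux/spinlock.h>
#include <linux/rcupdate.h>

static DEFINE_SPINLOCK(heavy_lock);
static bool move_in_progress;		/* the "flag" */

/* CPU-A: the mover */
static void start_move(void)
{
	move_in_progress = true;
	synchronize_rcu();	/* wait for updaters already inside rcu_read_lock() */
	/* only now take heavy_lock and update what CPU-B reads */
}

/* CPU-B: the page stat updater */
static void update_stat(void)
{
	unsigned long flags;
	bool locked = false;

	rcu_read_lock();
	if (move_in_progress) {		/* rare: only while a move is running */
		spin_lock_irqsave(&heavy_lock, flags);
		locked = true;
	}
	/* ... do the accounting job ... */
	if (locked)
		spin_unlock_irqrestore(&heavy_lock, flags);
	rcu_read_unlock();
}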
> memcontrol.c | 65 ++++++++++++++++++++++-------------------------------------
> 1 file changed, 25 insertions(+), 40 deletions(-)
>
> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Greg Thelen <gthelen@google.com>
> ---
> mm/memcontrol.c | 70 +++++++++++++++++++++++-------------------------------
> 1 files changed, 30 insertions(+), 40 deletions(-)
> @@ -2089,11 +2082,8 @@ static int __cpuinit memcg_cpu_hotplug_callback(struct notifier_block *nb,
>  	struct memcg_stock_pcp *stock;
>  	struct mem_cgroup *iter;
>
> -	if ((action == CPU_ONLINE)) {
> -		for_each_mem_cgroup(iter)
> -			synchronize_mem_cgroup_on_move(iter, cpu);
> +	if ((action == CPU_ONLINE))
Extra parenthesis. I recommend:
+ if (action == CPU_ONLINE)
* Re: [PATCH 3/6 v4] memcg: remove PCG_MOVE_LOCK flag from page_cgroup.
2012-02-14 3:13 ` [PATCH 3/6 v4] memcg: remove PCG_MOVE_LOCK flag from page_cgroup KAMEZAWA Hiroyuki
@ 2012-02-14 7:21 ` Greg Thelen
0 siblings, 0 replies; 16+ messages in thread
From: Greg Thelen @ 2012-02-14 7:21 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: linux-mm, linux-kernel, hannes, Michal Hocko, akpm, Ying Han,
Hugh Dickins
On Mon, Feb 13, 2012 at 7:13 PM, KAMEZAWA Hiroyuki
<kamezawa.hiroyu@jp.fujitsu.com> wrote:
> From ffd1b013fe294a80c12e3f30e85386135a6fb284 Mon Sep 17 00:00:00 2001
> From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Date: Thu, 2 Feb 2012 11:49:59 +0900
> Subject: [PATCH 3/6] memcg: remove PCG_MOVE_LOCK flag from page_cgroup.
>
> PCG_MOVE_LOCK is used as a bit spinlock to avoid races between overwriting
> pc->mem_cgroup and per-memcg page statistics accounting.
> This lock helps to avoid the race, but the race is very rare because moving
> tasks between cgroups is not a usual job.
> So, spending 1 bit per page on it seems too costly.
>
> This patch changes this lock into a per-memcg spinlock and removes PCG_MOVE_LOCK.
>
> If a finer-grained lock is required, we'll be able to add some hashing later, but
> I'd like to start from this.
>
> Changelog:
> - changed to pass memcg as an argument rather than page_cgroup,
>   and renamed move_lock_page_cgroup() to move_lock_mem_cgroup()
>
> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Seems good. Thanks.
Acked-by: Greg Thelen <gthelen@google.com>
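(A sketch of the direction this patch takes, inferred from the changelog; the
field name and details below are assumptions, not the literal patch.)

#include <linux/spinlock.h>

struct mem_cgroup {
	/* ... existing fields ... */
	spinlock_t move_lock;	/* replaces the per-page PCG_MOVE_LOCK bit */
};

static void move_lock_mem_cgroup(struct mem_cgroup *memcg,
				 unsigned long *flags)
{
	spin_lock_irqsave(&memcg->move_lock, *flags);
}

static void move_unlock_mem_cgroup(struct mem_cgroup *memcg,
				   unsigned long *flags)
{
	spin_unlock_irqrestore(&memcg->move_lock, *flags);
}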
* Re: [PATCH 4/6 v4] memcg: use new logic for page stat accounting
2012-02-14 3:14 ` [PATCH 4/6 v4] memcg: use new logic for page stat accounting KAMEZAWA Hiroyuki
@ 2012-02-14 7:22 ` Greg Thelen
2012-02-14 8:43 ` KAMEZAWA Hiroyuki
0 siblings, 1 reply; 16+ messages in thread
From: Greg Thelen @ 2012-02-14 7:22 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: linux-mm, linux-kernel, hannes, Michal Hocko, akpm, Ying Han,
Hugh Dickins
On Mon, Feb 13, 2012 at 7:14 PM, KAMEZAWA Hiroyuki
<kamezawa.hiroyu@jp.fujitsu.com> wrote:
> From ad2905362ef58a44d96a325193ab384739418050 Mon Sep 17 00:00:00 2001
> From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Date: Thu, 2 Feb 2012 11:49:59 +0900
> Subject: [PATCH 4/6] memcg: use new logic for page stat accounting.
>
> Now, per-memcg page stats are recorded in per-page_cgroup flags by
> duplicating the page's status into those flags. The reason is that memcg
> has a feature to move a page from one group to another, and we have a
> race between "move" and "page stat accounting".
>
> Under the current logic, assume CPU-A and CPU-B. CPU-A does "move"
> and CPU-B does "page stat accounting".
>
> When CPU-A goes 1st,
>
>       CPU-A                           CPU-B
>                                   update "struct page" info.
>    move_lock_mem_cgroup(memcg)
>    see flags
pc->flags?
>    copy page stat to new group
>    overwrite pc->mem_cgroup.
>    move_unlock_mem_cgroup(memcg)
>                                   move_lock_mem_cgroup(mem)
>                                   set pc->flags
>                                   update page stat accounting
>                                   move_unlock_mem_cgroup(mem)
>
> stat accounting is guarded by move_lock_mem_cgroup() and "move"
> logic (CPU-A) doesn't see changes in "struct page" information.
>
> But it's costly to have the same information both in 'struct page' and
> 'struct page_cgroup'. And, there is a potential problem.
>
> For example, assume we have PG_dirty accounting in memcg.
> PG_... is a flag for struct page.
> PCG_ is a flag for struct page_cgroup.
> (This is just an example. The same problem can be found in any
> kind of page stat accounting.)
>
>      CPU-A                           CPU-B
>    TestSet PG_dirty
>    (delay)                       TestClear PG_dirty_
PG_dirty
>                                  if (TestClear(PCG_dirty))
>                                        memcg->nr_dirty--
>    if (TestSet(PCG_dirty))
>          memcg->nr_dirty++
>
> @@ -141,6 +141,31 @@ static inline bool mem_cgroup_disabled(void)
>  	return false;
>  }
>
> +void __mem_cgroup_begin_update_page_stat(struct page *page,
> +					bool *lock, unsigned long *flags);
> +
> +static inline void mem_cgroup_begin_update_page_stat(struct page *page,
> +					bool *lock, unsigned long *flags)
> +{
> +	if (mem_cgroup_disabled())
> +		return;
> +	rcu_read_lock();
> +	*lock = false;
This seems like a strange place to set *lock=false. I think it's
clearer if __mem_cgroup_begin_update_page_stat() is the only routine
that sets or clears *lock. But I do see that in patch 6/6 'memcg: fix
performance of mem_cgroup_begin_update_page_stat()' this position is
required.
> +	return __mem_cgroup_begin_update_page_stat(page, lock, flags);
> +}
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index ecf8856..30afea5 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -1877,32 +1877,54 @@ bool mem_cgroup_handle_oom(struct mem_cgroup *memcg, gfp_t mask)
>   * If there is, we take a lock.
>   */
>
> +void __mem_cgroup_begin_update_page_stat(struct page *page,
> +				bool *lock, unsigned long *flags)
> +{
> +	struct mem_cgroup *memcg;
> +	struct page_cgroup *pc;
> +
> +	pc = lookup_page_cgroup(page);
> +again:
> +	memcg = pc->mem_cgroup;
> +	if (unlikely(!memcg || !PageCgroupUsed(pc)))
> +		return;
> +	if (!mem_cgroup_stealed(memcg))
> +		return;
> +
> +	move_lock_mem_cgroup(memcg, flags);
> +	if (memcg != pc->mem_cgroup || !PageCgroupUsed(pc)) {
> +		move_unlock_mem_cgroup(memcg, flags);
> +		goto again;
> +	}
> +	*lock = true;
> +}
> +
> +void __mem_cgroup_end_update_page_stat(struct page *page,
> +				bool *lock, unsigned long *flags)
'lock' looks like an unused parameter. If so, then remove it.
> +{
> +	struct page_cgroup *pc = lookup_page_cgroup(page);
> +
> +	/*
> +	 * It's guaranteed that pc->mem_cgroup never changes while
> +	 * lock is held
Please continue the comment, describing what provides this guarantee. I
assume it is because rcu_read_lock() is held by
mem_cgroup_begin_update_page_stat(). Maybe it's best to just make a
small reference to the locking protocol description in
mem_cgroup_start_move().
> +	 */
> +	move_unlock_mem_cgroup(pc->mem_cgroup, flags);
> +}
> +
> +
I think it would be useful to add a small comment here declaring that
all callers of this routine must be in a
mem_cgroup_begin_update_page_stat(), mem_cgroup_end_update_page_stat()
critical section to keep pc->mem_cgroup stable.
>  void mem_cgroup_update_page_stat(struct page *page,
>  				 enum mem_cgroup_page_stat_item idx, int val)
>  {
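(To make the intended usage concrete, a hedged sketch of a caller -- not code
from this patch, and the stat item name is an assumption: every page stat
update sits inside a begin/end pair so pc->mem_cgroup cannot change under it.)

void some_file_mapped_update(struct page *page)	/* hypothetical caller */
{
	bool locked;
	unsigned long flags;

	mem_cgroup_begin_update_page_stat(page, &locked, &flags);
	/*
	 * "struct page" state and the memcg counter are updated together,
	 * so a racing move sees either both or neither.
	 */
	mem_cgroup_update_page_stat(page, MEMCG_NR_FILE_MAPPED, 1);
	mem_cgroup_end_update_page_stat(page, &locked, &flags);
}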
* Re: [PATCH 5/6 v4] memcg: remove PCG_FILE_MAPPED
2012-02-14 3:15 ` [PATCH 5/6 v4] memcg: remove PCG_FILE_MAPPED KAMEZAWA Hiroyuki
@ 2012-02-14 7:22 ` Greg Thelen
0 siblings, 0 replies; 16+ messages in thread
From: Greg Thelen @ 2012-02-14 7:22 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: linux-mm, linux-kernel, hannes, Michal Hocko, akpm, Ying Han,
Hugh Dickins
On Mon, Feb 13, 2012 at 7:15 PM, KAMEZAWA Hiroyuki
<kamezawa.hiroyu@jp.fujitsu.com> wrote:
> From 96407a510d5212179a4768f28591b35d5c17d403 Mon Sep 17 00:00:00 2001
> From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Date: Thu, 2 Feb 2012 15:02:18 +0900
> Subject: [PATCH 5/6] memcg: remove PCG_FILE_MAPPED
>
> With the new lock scheme for updating memcg's page stats, we don't need
> the PCG_FILE_MAPPED flag, which duplicated the information of page_mapped().
>
> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Thanks!
Acked-by: Greg Thelen <gthelen@google.com>
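(A rough sketch of the idea -- the names follow mm/memcontrol.c of that time
but should be treated as assumptions, not the literal patch: when a page is
moved between memcgs, the mapped-file counter can be transferred based on
page_mapped() itself, so no PCG bit has to be kept in sync.)

static void move_file_mapped(struct page *page,
			     struct mem_cgroup *from, struct mem_cgroup *to)
{
	if (PageAnon(page) || !page_mapped(page))
		return;
	/* transfer the FILE_MAPPED count from the old memcg to the new one */
	__this_cpu_dec(from->stat->count[MEM_CGROUP_STAT_FILE_MAPPED]);
	__this_cpu_inc(to->stat->count[MEM_CGROUP_STAT_FILE_MAPPED]);
}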
* Re: [PATCH 6/6 v4] memcg: fix performance of mem_cgroup_begin_update_page_stat()
2012-02-14 3:16 ` [PATCH 6/6 v4] memcg: fix performance of mem_cgroup_begin_update_page_stat() KAMEZAWA Hiroyuki
@ 2012-02-14 7:22 ` Greg Thelen
2012-02-14 11:08 ` KAMEZAWA Hiroyuki
0 siblings, 1 reply; 16+ messages in thread
From: Greg Thelen @ 2012-02-14 7:22 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: linux-mm, linux-kernel, hannes, Michal Hocko, akpm, Ying Han,
Hugh Dickins
On Mon, Feb 13, 2012 at 7:16 PM, KAMEZAWA Hiroyuki
<kamezawa.hiroyu@jp.fujitsu.com> wrote:
> From 3377fd7b6e23a5d2a368c078eae27e2b49c4f4aa Mon Sep 17 00:00:00 2001
> From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Date: Mon, 6 Feb 2012 12:14:47 +0900
> Subject: [PATCH 6/6] memcg: fix performance of mem_cgroup_begin_update_page_stat()
>
> mem_cgroup_begin_update_page_stat() should be very fast because
> it's called very frequently. Now, it needs to look up page_cgroup
> and its memcg....this is slow.
>
> This patch adds a global variable to check "a memcg is moving or not".
s/a memcg/any memcg/
> By this, the caller doesn't need to visit page_cgroup and memcg.
s/By/With/
> Here is a test result. A test program page-faults a MAP_SHARED file,
> making each page's page_mapcount(page) > 1, frees the range with
> madvise(), and then faults the pages in again. This program causes
> 26214400 page faults onto a 1G file and shows
> the cost of mem_cgroup_begin_update_page_stat().
Out of curiosity, what is the performance of the mmap program before
this series?
> Before this patch for mem_cgroup_begin_update_page_stat()
> [kamezawa@bluextal test]$ time ./mmap 1G
>
> real 0m21.765s
> user 0m5.999s
> sys 0m15.434s
>
>     27.46%  mmap  mmap               [.] reader
>     21.15%  mmap  [kernel.kallsyms]  [k] page_fault
>      9.17%  mmap  [kernel.kallsyms]  [k] filemap_fault
>      2.96%  mmap  [kernel.kallsyms]  [k] __do_fault
>      2.83%  mmap  [kernel.kallsyms]  [k] __mem_cgroup_begin_update_page_stat
>
> After this patch
> [root@bluextal test]# time ./mmap 1G
>
> real 0m21.373s
> user 0m6.113s
> sys 0m15.016s
>
> In the usual path, calls to __mem_cgroup_begin_update_page_stat() go away.
>
> Note: we may be able to remove this optimization in the future if
> we can get a pointer to the memcg directly from struct page.
>
> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Greg Thelen <gthelen@google.com>
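(A sketch of the fast path the changelog describes; the variable name
memcg_moving, its declaration site, and the helper shown here are
assumptions rather than the literal patch.)

/* non-zero while any task/charge move is in flight; visible to the inline */
atomic_t memcg_moving;

static void mem_cgroup_start_move(struct mem_cgroup *memcg)
{
	atomic_inc(&memcg_moving);
	/* ... existing per-memcg "move in progress" bookkeeping ... */
}

/* the common no-move case never touches page_cgroup or the memcg */
static inline void mem_cgroup_begin_update_page_stat(struct page *page,
					bool *locked, unsigned long *flags)
{
	if (mem_cgroup_disabled())
		return;
	rcu_read_lock();
	*locked = false;
	if (atomic_read(&memcg_moving))
		__mem_cgroup_begin_update_page_stat(page, locked, flags);
}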
* Re: [PATCH 4/6 v4] memcg: use new logic for page stat accounting
2012-02-14 7:22 ` Greg Thelen
@ 2012-02-14 8:43 ` KAMEZAWA Hiroyuki
0 siblings, 0 replies; 16+ messages in thread
From: KAMEZAWA Hiroyuki @ 2012-02-14 8:43 UTC (permalink / raw)
To: Greg Thelen
Cc: linux-mm, linux-kernel, hannes, Michal Hocko, akpm, Ying Han,
Hugh Dickins
On Mon, 13 Feb 2012 23:22:22 -0800
Greg Thelen <gthelen@google.com> wrote:
> On Mon, Feb 13, 2012 at 7:14 PM, KAMEZAWA Hiroyuki
> <kamezawa.hiroyu@jp.fujitsu.com> wrote:
> > From ad2905362ef58a44d96a325193ab384739418050 Mon Sep 17 00:00:00 2001
> > From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> > Date: Thu, 2 Feb 2012 11:49:59 +0900
> > Subject: [PATCH 4/6] memcg: use new logic for page stat accounting.
> >
> > Now, per-memcg page stats are recorded in per-page_cgroup flags by
> > duplicating the page's status into those flags. The reason is that memcg
> > has a feature to move a page from one group to another, and we have a
> > race between "move" and "page stat accounting".
> >
> > Under the current logic, assume CPU-A and CPU-B. CPU-A does "move"
> > and CPU-B does "page stat accounting".
> >
> > When CPU-A goes 1st,
> >
> >       CPU-A                           CPU-B
> >                                   update "struct page" info.
> >    move_lock_mem_cgroup(memcg)
> >    see flags
>
> pc->flags?
>
yes.
> >    copy page stat to new group
> >    overwrite pc->mem_cgroup.
> >    move_unlock_mem_cgroup(memcg)
> >                                   move_lock_mem_cgroup(mem)
> >                                   set pc->flags
> >                                   update page stat accounting
> >                                   move_unlock_mem_cgroup(mem)
> >
> > stat accounting is guarded by move_lock_mem_cgroup() and "move"
> > logic (CPU-A) doesn't see changes in "struct page" information.
> >
> > But it's costly to have the same information both in 'struct page' and
> > 'struct page_cgroup'. And, there is a potential problem.
> >
> > For example, assume we have PG_dirty accounting in memcg.
> > PG_... is a flag for struct page.
> > PCG_ is a flag for struct page_cgroup.
> > (This is just an example. The same problem can be found in any
> > kind of page stat accounting.)
> >
> >      CPU-A                           CPU-B
> >    TestSet PG_dirty
> >    (delay)                       TestClear PG_dirty_
>
> PG_dirty
>
> >                                  if (TestClear(PCG_dirty))
> >                                        memcg->nr_dirty--
> >    if (TestSet(PCG_dirty))
> >          memcg->nr_dirty++
> >
>
> > @@ -141,6 +141,31 @@ static inline bool mem_cgroup_disabled(void)
> >  	return false;
> >  }
> >
> > +void __mem_cgroup_begin_update_page_stat(struct page *page,
> > +					bool *lock, unsigned long *flags);
> > +
> > +static inline void mem_cgroup_begin_update_page_stat(struct page *page,
> > +					bool *lock, unsigned long *flags)
> > +{
> > +	if (mem_cgroup_disabled())
> > +		return;
> > +	rcu_read_lock();
> > +	*lock = false;
>
> This seems like a strange place to set *lock=false. I think it's
> clearer if __mem_cgroup_begin_update_page_stat() is the only routine
> that sets or clears *lock. But I do see that in patch 6/6 'memcg: fix
> performance of mem_cgroup_begin_update_page_stat()' this position is
> required.
>
Ah, yes. Hmm, it would be better to move this into the body of the function.
> > +	return __mem_cgroup_begin_update_page_stat(page, lock, flags);
> > +}
>
> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > index ecf8856..30afea5 100644
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -1877,32 +1877,54 @@ bool mem_cgroup_handle_oom(struct mem_cgroup *memcg, gfp_t mask)
> >   * If there is, we take a lock.
> >   */
> >
> > +void __mem_cgroup_begin_update_page_stat(struct page *page,
> > +				bool *lock, unsigned long *flags)
> > +{
> > +	struct mem_cgroup *memcg;
> > +	struct page_cgroup *pc;
> > +
> > +	pc = lookup_page_cgroup(page);
> > +again:
> > +	memcg = pc->mem_cgroup;
> > +	if (unlikely(!memcg || !PageCgroupUsed(pc)))
> > +		return;
> > +	if (!mem_cgroup_stealed(memcg))
> > +		return;
> > +
> > +	move_lock_mem_cgroup(memcg, flags);
> > +	if (memcg != pc->mem_cgroup || !PageCgroupUsed(pc)) {
> > +		move_unlock_mem_cgroup(memcg, flags);
> > +		goto again;
> > +	}
> > +	*lock = true;
> > +}
> > +
> > +void __mem_cgroup_end_update_page_stat(struct page *page,
> > +				bool *lock, unsigned long *flags)
>
> 'lock' looks like an unused parameter. If so, then remove it.
>
Ok.
> > +{
> > +	struct page_cgroup *pc = lookup_page_cgroup(page);
> > +
> > +	/*
> > +	 * It's guaranteed that pc->mem_cgroup never changes while
> > +	 * lock is held
>
> Please continue the comment, describing what provides this guarantee. I
> assume it is because rcu_read_lock() is held by
> mem_cgroup_begin_update_page_stat(). Maybe it's best to just make a
> small reference to the locking protocol description in
> mem_cgroup_start_move().
>
Ok, I will update this.
> > +	 */
> > +	move_unlock_mem_cgroup(pc->mem_cgroup, flags);
> > +}
> > +
> > +
>
> I think it would be useful to add a small comment here declaring that
> all callers of this routine must be in a
> mem_cgroup_begin_update_page_stat(), mem_cgroup_end_update_page_stat()
> critical section to keep pc->mem_cgroup stable.
>
Sure, will do.
Thank you for review.
-Kame
> >  void mem_cgroup_update_page_stat(struct page *page,
> >  				 enum mem_cgroup_page_stat_item idx, int val)
> >  {
>
* Re: [PATCH 2/6 v4] memcg: simplify move_account() check
2012-02-14 7:21 ` Greg Thelen
@ 2012-02-14 8:44 ` KAMEZAWA Hiroyuki
0 siblings, 0 replies; 16+ messages in thread
From: KAMEZAWA Hiroyuki @ 2012-02-14 8:44 UTC (permalink / raw)
To: Greg Thelen
Cc: linux-mm, linux-kernel, hannes, Michal Hocko, akpm, Ying Han,
Hugh Dickins
On Mon, 13 Feb 2012 23:21:34 -0800
Greg Thelen <gthelen@google.com> wrote:
> On Mon, Feb 13, 2012 at 7:07 PM, KAMEZAWA Hiroyuki
> <kamezawa.hiroyu@jp.fujitsu.com> wrote:
> > From 9cdb3b63dc8d08cc2220c54c80438c13433a0d12 Mon Sep 17 00:00:00 2001
> > From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> > Date: Thu, 2 Feb 2012 10:02:39 +0900
> > Subject: [PATCH 2/6] memcg: simplify move_account() check.
> >
> > In memcg, to avoid taking a lock with irqs disabled when accessing page_cgroup,
> > a flag + rcu_read_lock() scheme is used. It works as follows:
> >
> >    CPU-A                       CPU-B
> >                                rcu_read_lock()
> >    set flag
> >                                if(flag is set)
> >                                      take heavy lock
> >                                do job.
> >    synchronize_rcu()           rcu_read_unlock()
>
> I assume that CPU-A will take heavy lock after synchronize_rcu() when
> updating variables read by CPU-B.
>
Ah, yes. I should have written that.
> >  memcontrol.c | 65 ++++++++++++++++++++++-------------------------------
> >  1 file changed, 25 insertions(+), 40 deletions(-)
> >
> > Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
>
> Acked-by: Greg Thelen <gthelen@google.com>
>
Thank you!
-Kame
* Re: [PATCH 6/6 v4] memcg: fix performance of mem_cgroup_begin_update_page_stat()
2012-02-14 7:22 ` Greg Thelen
@ 2012-02-14 11:08 ` KAMEZAWA Hiroyuki
0 siblings, 0 replies; 16+ messages in thread
From: KAMEZAWA Hiroyuki @ 2012-02-14 11:08 UTC (permalink / raw)
To: Greg Thelen
Cc: linux-mm, linux-kernel, hannes, Michal Hocko, akpm, Ying Han,
Hugh Dickins
On Mon, 13 Feb 2012 23:22:50 -0800
Greg Thelen <gthelen@google.com> wrote:
> On Mon, Feb 13, 2012 at 7:16 PM, KAMEZAWA Hiroyuki
> <kamezawa.hiroyu@jp.fujitsu.com> wrote:
> > From 3377fd7b6e23a5d2a368c078eae27e2b49c4f4aa Mon Sep 17 00:00:00 2001
> > From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> > Date: Mon, 6 Feb 2012 12:14:47 +0900
> > Subject: [PATCH 6/6] memcg: fix performance of mem_cgroup_begin_update_page_stat()
> >
> > mem_cgroup_begin_update_page_stat() should be very fast because
> > it's called very frequently. Now, it needs to look up page_cgroup
> > and its memcg....this is slow.
> >
> > This patch adds a global variable to check "a memcg is moving or not".
>
> s/a memcg/any memcg/
>
yes.
> > By this, the caller doesn't need to visit page_cgroup and memcg.
>
> s/By/With/
>
ok.
> > Here is a test result. A test program page-faults a MAP_SHARED file,
> > making each page's page_mapcount(page) > 1, frees the range with
> > madvise(), and then faults the pages in again. This program causes
> > 26214400 page faults onto a 1G file and shows
> > the cost of mem_cgroup_begin_update_page_stat().
>
> Out of curiosity, what is the performance of the mmap program before
> this series?
>
Scores of 3 runs under linux-next:
==
[root@bluextal test]# time ./mmap 1G
real 0m21.041s
user 0m6.146s
sys 0m14.625s
[root@bluextal test]# time ./mmap 1G
real 0m21.063s
user 0m6.019s
sys 0m14.776s
[root@bluextal test]# time ./mmap 1G
real 0m21.178s
user 0m6.000s
sys 0m14.849s
==
My program is attached below. It checks the cost of updating FILE_MAPPED.
I guess the error in the sys time scores is about 0.2-0.3 sec.
Thanks,
-Kame
==
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <sys/mman.h>
#include <fcntl.h>

void reader(int fd, int size)
{
	int i, off, x = 0;
	char *addr;

	/* second mapping of the same file, so page_mapcount(page) > 1 */
	addr = mmap(NULL, size, PROT_READ, MAP_SHARED, fd, 0);
	for (i = 0; i < 100; i++) {
		for (off = 0; off < size; off += 4096)
			x += *(addr + off);		/* fault every page in */
		madvise(addr, size, MADV_DONTNEED);	/* drop PTEs, fault again */
	}
}

int main(int argc, char *argv[])
{
	int fd;
	char *addr;
	unsigned long size;
	struct stat statbuf;

	fd = open(argv[1], O_RDONLY);
	if (fd < 0) {
		perror("cannot open file");
		return 1;
	}
	if (fstat(fd, &statbuf)) {
		perror("fstat failed");
		return 1;
	}
	size = statbuf.st_size;
	/* first mapping, mlock()ed so the pages stay resident */
	addr = mmap(NULL, size, PROT_READ, MAP_SHARED, fd, 0);
	mlock(addr, size);
	reader(fd, size);
	return 0;
}
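(Not from the original mail: the listing above should build with a plain
"gcc -O2 -o mmap mmap.c"; the single argument is an existing file of the
desired size, matching the "./mmap 1G" runs quoted above.)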
Thread overview: 16+ messages
2012-02-14 3:04 [PATCH 0/6 v4] memcg: page cgroup diet KAMEZAWA Hiroyuki
2012-02-14 3:06 ` [PATCH 1/6 v4] memcg: remove EXPORT_SYMBOL(mem_cgroup_update_page_stat) KAMEZAWA Hiroyuki
2012-02-14 7:21 ` Greg Thelen
2012-02-14 3:07 ` [PATCH 2/6 v4] memcg: simplify move_account() check KAMEZAWA Hiroyuki
2012-02-14 7:21 ` Greg Thelen
2012-02-14 8:44 ` KAMEZAWA Hiroyuki
2012-02-14 3:13 ` [PATCH 3/6 v4] memcg: remove PCG_MOVE_LOCK flag from page_cgroup KAMEZAWA Hiroyuki
2012-02-14 7:21 ` Greg Thelen
2012-02-14 3:14 ` [PATCH 4/6 v4] memcg: use new logic for page stat accounting KAMEZAWA Hiroyuki
2012-02-14 7:22 ` Greg Thelen
2012-02-14 8:43 ` KAMEZAWA Hiroyuki
2012-02-14 3:15 ` [PATCH 5/6 v4] memcg: remove PCG_FILE_MAPPED KAMEZAWA Hiroyuki
2012-02-14 7:22 ` Greg Thelen
2012-02-14 3:16 ` [PATCH 6/6 v4] memcg: fix performance of mem_cgroup_begin_update_page_stat() KAMEZAWA Hiroyuki
2012-02-14 7:22 ` Greg Thelen
2012-02-14 11:08 ` KAMEZAWA Hiroyuki