* [RFC] Low overhead patches for the memory resource controller
@ 2009-05-13 15:32 Balbir Singh
2009-05-14 0:08 ` KAMEZAWA Hiroyuki
` (2 more replies)
0 siblings, 3 replies; 10+ messages in thread
From: Balbir Singh @ 2009-05-13 15:32 UTC (permalink / raw)
To: linux-mm
Cc: Andrew Morton, KAMEZAWA Hiroyuki, nishimura, lizf, KOSAKI Motohiro
Important: Not for inclusion, for discussion only
I've been experimenting with a version of the patches below. They add
a PCG_ROOT page_cgroup flag (mask PCGF_ROOT) for tracking pages that
belong to the root cgroup, and disable memcg LRU manipulation for them.
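To make the idea concrete, here is a minimal userspace sketch of the LRU guard. It is not kernel code: struct pc_model, pc_test(), pc_set() and wants_memcg_lru() are hypothetical stand-ins for struct page_cgroup and the PageCgroup*() helpers, and only the bit names come from the patch.

```c
#include <stdbool.h>

/* Bit positions, mirroring the enum in include/linux/page_cgroup.h. */
enum { PCG_LOCK, PCG_CACHE, PCG_USED, PCG_ROOT };

/* Simplified stand-in for struct page_cgroup: only the flags word. */
struct pc_model {
	unsigned long flags;
};

static bool pc_test(const struct pc_model *pc, int bit)
{
	return (pc->flags >> bit) & 1UL;
}

static void pc_set(struct pc_model *pc, int bit)
{
	pc->flags |= 1UL << bit;
}

/*
 * Mirrors the condition the patch adds to mem_cgroup_add_lru_list() and
 * mem_cgroup_rotate_lru_list(): only used, non-root pages are placed on
 * the per-cgroup LRU. Root pages stay on the global LRU only.
 */
static bool wants_memcg_lru(const struct pc_model *pc)
{
	if (!pc_test(pc, PCG_USED) || pc_test(pc, PCG_ROOT))
		return false;
	return true;
}
```

The saving, if any, would come from skipping the per-zone list handling for every page charged to the root cgroup.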
Caveats:
1. I've not checked accounting; it might be broken.
2. I've not yet made the root cgroup non-limitable; we need to disable
hard limits once we agree to go with this.
Tests
Quick tests show an improvement with AIM9
            mmotm+patch   mmotm-08-may-2009
AIM9            1338.57             1338.17
Dbase          18034.16            16021.58
New Dbase      18482.24            16518.54
Shared          9935.98             8882.11
Compute        16619.81            15226.13
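As a rough reading of the table, the relative gain per row can be computed as below. This is only arithmetic on the numbers quoted above; gain_percent() and print_gains() are illustrative names, and treating higher scores as better is an assumption based on the rows being reported as an improvement. By this measure the Dbase, New Dbase, Shared and Compute rows gain roughly 9 to 13 percent, while the AIM9 row is essentially flat.

```c
#include <stdio.h>

/* Relative gain of the patched kernel over the baseline, in percent. */
static double gain_percent(double patched, double baseline)
{
	return (patched - baseline) / baseline * 100.0;
}

struct bench_row {
	const char *name;
	double patched;		/* mmotm+patch */
	double baseline;	/* mmotm-08-may-2009 */
};

/* The five rows exactly as quoted in the table. */
static const struct bench_row rows[] = {
	{ "AIM9",       1338.57,  1338.17 },
	{ "Dbase",     18034.16, 16021.58 },
	{ "New Dbase", 18482.24, 16518.54 },
	{ "Shared",     9935.98,  8882.11 },
	{ "Compute",   16619.81, 15226.13 },
};

/* Print one line per row with its relative gain. */
static void print_gains(void)
{
	size_t i;

	for (i = 0; i < sizeof(rows) / sizeof(rows[0]); i++)
		printf("%-10s %6.2f%%\n", rows[i].name,
		       gain_percent(rows[i].patched, rows[i].baseline));
}
```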
Comments on the approach much appreciated
Feature: Remove the overhead associated with the root cgroup
From: Balbir Singh <balbir@linux.vnet.ibm.com>
This patch changes the memory cgroup and removes the overhead associated
with accounting all pages in the root cgroup. As a side-effect, we can
no longer set a memory hard limit in the root cgroup.
A new flag is used to track page_cgroup associated with the root cgroup
pages.
---
include/linux/page_cgroup.h | 5 +++++
mm/memcontrol.c | 23 +++++++++++++++++------
mm/page_cgroup.c | 1 -
3 files changed, 22 insertions(+), 7 deletions(-)
diff --git a/include/linux/page_cgroup.h b/include/linux/page_cgroup.h
index 7339c7b..9c88e85 100644
--- a/include/linux/page_cgroup.h
+++ b/include/linux/page_cgroup.h
@@ -26,6 +26,7 @@ enum {
PCG_LOCK, /* page cgroup is locked */
PCG_CACHE, /* charged as cache */
PCG_USED, /* this object is in use. */
+ PCG_ROOT, /* page belongs to root cgroup */
};
#define TESTPCGFLAG(uname, lname) \
@@ -46,6 +47,10 @@ TESTPCGFLAG(Cache, CACHE)
TESTPCGFLAG(Used, USED)
CLEARPCGFLAG(Used, USED)
+SETPCGFLAG(Root, ROOT)
+CLEARPCGFLAG(Root, ROOT)
+TESTPCGFLAG(Root, ROOT)
+
static inline int page_cgroup_nid(struct page_cgroup *pc)
{
return page_to_nid(pc->page);
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 9712ef7..2750bed 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -43,6 +43,7 @@
struct cgroup_subsys mem_cgroup_subsys __read_mostly;
#define MEM_CGROUP_RECLAIM_RETRIES 5
+struct mem_cgroup *root_mem_cgroup __read_mostly;
#ifdef CONFIG_CGROUP_MEM_RES_CTLR_SWAP
/* Turned on only when memory cgroup is enabled && really_do_swap_account = 0 */
@@ -196,6 +197,7 @@ enum charge_type {
#define PCGF_CACHE (1UL << PCG_CACHE)
#define PCGF_USED (1UL << PCG_USED)
#define PCGF_LOCK (1UL << PCG_LOCK)
+#define PCGF_ROOT (1UL << PCG_ROOT)
static const unsigned long
pcg_default_flags[NR_CHARGE_TYPE] = {
PCGF_CACHE | PCGF_USED | PCGF_LOCK, /* File Cache */
@@ -422,6 +424,8 @@ void mem_cgroup_del_lru_list(struct page *page, enum lru_list lru)
/* can happen while we handle swapcache. */
if (list_empty(&pc->lru) || !pc->mem_cgroup)
return;
+ if (PageCgroupRoot(pc))
+ return;
/*
* We don't check PCG_USED bit. It's cleared when the "page" is finally
* removed from global LRU.
@@ -452,8 +456,8 @@ void mem_cgroup_rotate_lru_list(struct page *page, enum lru_list lru)
* For making pc->mem_cgroup visible, insert smp_rmb() here.
*/
smp_rmb();
- /* unused page is not rotated. */
- if (!PageCgroupUsed(pc))
+ /* unused or root page is not rotated. */
+ if (!PageCgroupUsed(pc) || PageCgroupRoot(pc))
return;
mz = page_cgroup_zoneinfo(pc);
list_move(&pc->lru, &mz->lists[lru]);
@@ -472,7 +476,7 @@ void mem_cgroup_add_lru_list(struct page *page, enum lru_list lru)
* For making pc->mem_cgroup visible, insert smp_rmb() here.
*/
smp_rmb();
- if (!PageCgroupUsed(pc))
+ if (!PageCgroupUsed(pc) || PageCgroupRoot(pc))
return;
mz = page_cgroup_zoneinfo(pc);
@@ -1114,9 +1118,12 @@ static void __mem_cgroup_commit_charge(struct mem_cgroup *mem,
css_put(&mem->css);
return;
}
- pc->mem_cgroup = mem;
- smp_wmb();
- pc->flags = pcg_default_flags[ctype];
+ if (mem != root_mem_cgroup) {
+ pc->mem_cgroup = mem;
+ smp_wmb();
+ pc->flags = pcg_default_flags[ctype];
+ } else
+ SetPageCgroupRoot(pc);
mem_cgroup_charge_statistics(mem, pc, true);
@@ -1521,6 +1528,8 @@ __mem_cgroup_uncharge_common(struct page *page, enum charge_type ctype)
mem_cgroup_charge_statistics(mem, pc, false);
ClearPageCgroupUsed(pc);
+ if (mem == root_mem_cgroup)
+ ClearPageCgroupRoot(pc);
/*
* pc->mem_cgroup is not cleared here. It will be accessed when it's
* freed from LRU. This is safe because uncharged page is expected not
@@ -2504,6 +2513,7 @@ mem_cgroup_create(struct cgroup_subsys *ss, struct cgroup *cont)
if (cont->parent == NULL) {
enable_swap_cgroup();
parent = NULL;
+ root_mem_cgroup = mem;
} else {
parent = mem_cgroup_from_cont(cont->parent);
mem->use_hierarchy = parent->use_hierarchy;
@@ -2532,6 +2542,7 @@ mem_cgroup_create(struct cgroup_subsys *ss, struct cgroup *cont)
return &mem->css;
free_out:
__mem_cgroup_free(mem);
+ root_mem_cgroup = NULL;
return ERR_PTR(error);
}
diff --git a/mm/page_cgroup.c b/mm/page_cgroup.c
index 09b73c5..6145ff6 100644
--- a/mm/page_cgroup.c
+++ b/mm/page_cgroup.c
@@ -276,7 +276,6 @@ void __meminit pgdat_page_cgroup_init(struct pglist_data *pgdat)
#endif
-
#ifdef CONFIG_CGROUP_MEM_RES_CTLR_SWAP
static DEFINE_MUTEX(swap_cgroup_mutex);
--
Thanks!
Balbir
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
* Re: [RFC] Low overhead patches for the memory resource controller
2009-05-13 15:32 [RFC] Low overhead patches for the memory resource controller Balbir Singh
@ 2009-05-14 0:08 ` KAMEZAWA Hiroyuki
2009-05-14 0:24 ` KAMEZAWA Hiroyuki
2009-05-14 2:57 ` Balbir Singh
2009-05-14 0:42 ` KAMEZAWA Hiroyuki
2009-05-14 1:35 ` KOSAKI Motohiro
2 siblings, 2 replies; 10+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-05-14 0:08 UTC (permalink / raw)
To: balbir; +Cc: linux-mm, Andrew Morton, nishimura, lizf, KOSAKI Motohiro
On Wed, 13 May 2009 21:02:18 +0530
Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
> Important: Not for inclusion, for discussion only
>
> I've been experimenting with a version of the patches below. They add
> a PCGF_ROOT flag for tracking pages belonging to the root cgroup and
> disable LRU manipulation for them
>
> Caveats:
>
> 1. I've not checked accounting, accounting might be broken
> 2. I've not made the root cgroup as non limitable, we need to disable
> hard limits once we agree to go with this
>
>
> Tests
>
> Quick tests show an improvement with AIM9
>
>             mmotm+patch   mmotm-08-may-2009
> AIM9            1338.57             1338.17
> Dbase          18034.16            16021.58
> New Dbase      18482.24            16518.54
> Shared          9935.98             8882.11
> Compute        16619.81            15226.13
>
> Comments on the approach much appreciated
>
> Feature: Remove the overhead associated with the root cgroup
>
> From: Balbir Singh <balbir@linux.vnet.ibm.com>
>
> This patch changes the memory cgroup and removes the overhead associated
> with accounting all pages in the root cgroup. As a side-effect, we can
> no longer set a memory hard limit in the root cgroup.
>
> A new flag is used to track page_cgroup associated with the root cgroup
> pages.
Hmm ? How about ignoring memcg completely when the thread belongs to ROOT
cgroup rather than this halfway method ?
Thanks,
-Kame
> ---
>
> include/linux/page_cgroup.h | 5 +++++
> mm/memcontrol.c | 23 +++++++++++++++++------
> mm/page_cgroup.c | 1 -
> 3 files changed, 22 insertions(+), 7 deletions(-)
>
>
> diff --git a/include/linux/page_cgroup.h b/include/linux/page_cgroup.h
> index 7339c7b..9c88e85 100644
> --- a/include/linux/page_cgroup.h
> +++ b/include/linux/page_cgroup.h
> @@ -26,6 +26,7 @@ enum {
> PCG_LOCK, /* page cgroup is locked */
> PCG_CACHE, /* charged as cache */
> PCG_USED, /* this object is in use. */
> + PCG_ROOT, /* page belongs to root cgroup */
> };
>
> #define TESTPCGFLAG(uname, lname) \
> @@ -46,6 +47,10 @@ TESTPCGFLAG(Cache, CACHE)
> TESTPCGFLAG(Used, USED)
> CLEARPCGFLAG(Used, USED)
>
> +SETPCGFLAG(Root, ROOT)
> +CLEARPCGFLAG(Root, ROOT)
> +TESTPCGFLAG(Root, ROOT)
> +
> static inline int page_cgroup_nid(struct page_cgroup *pc)
> {
> return page_to_nid(pc->page);
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 9712ef7..2750bed 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -43,6 +43,7 @@
>
> struct cgroup_subsys mem_cgroup_subsys __read_mostly;
> #define MEM_CGROUP_RECLAIM_RETRIES 5
> +struct mem_cgroup *root_mem_cgroup __read_mostly;
>
> #ifdef CONFIG_CGROUP_MEM_RES_CTLR_SWAP
> /* Turned on only when memory cgroup is enabled && really_do_swap_account = 0 */
> @@ -196,6 +197,7 @@ enum charge_type {
> #define PCGF_CACHE (1UL << PCG_CACHE)
> #define PCGF_USED (1UL << PCG_USED)
> #define PCGF_LOCK (1UL << PCG_LOCK)
> +#define PCGF_ROOT (1UL << PCG_ROOT)
> static const unsigned long
> pcg_default_flags[NR_CHARGE_TYPE] = {
> PCGF_CACHE | PCGF_USED | PCGF_LOCK, /* File Cache */
> @@ -422,6 +424,8 @@ void mem_cgroup_del_lru_list(struct page *page, enum lru_list lru)
> /* can happen while we handle swapcache. */
> if (list_empty(&pc->lru) || !pc->mem_cgroup)
> return;
> + if (PageCgroupRoot(pc))
> + return;
> /*
> * We don't check PCG_USED bit. It's cleared when the "page" is finally
> * removed from global LRU.
> @@ -452,8 +456,8 @@ void mem_cgroup_rotate_lru_list(struct page *page, enum lru_list lru)
> * For making pc->mem_cgroup visible, insert smp_rmb() here.
> */
> smp_rmb();
> - /* unused page is not rotated. */
> - if (!PageCgroupUsed(pc))
> + /* unused or root page is not rotated. */
> + if (!PageCgroupUsed(pc) || PageCgroupRoot(pc))
> return;
> mz = page_cgroup_zoneinfo(pc);
> list_move(&pc->lru, &mz->lists[lru]);
> @@ -472,7 +476,7 @@ void mem_cgroup_add_lru_list(struct page *page, enum lru_list lru)
> * For making pc->mem_cgroup visible, insert smp_rmb() here.
> */
> smp_rmb();
> - if (!PageCgroupUsed(pc))
> + if (!PageCgroupUsed(pc) || PageCgroupRoot(pc))
> return;
>
> mz = page_cgroup_zoneinfo(pc);
> @@ -1114,9 +1118,12 @@ static void __mem_cgroup_commit_charge(struct mem_cgroup *mem,
> css_put(&mem->css);
> return;
> }
> - pc->mem_cgroup = mem;
> - smp_wmb();
> - pc->flags = pcg_default_flags[ctype];
> + if (mem != root_mem_cgroup) {
> + pc->mem_cgroup = mem;
> + smp_wmb();
> + pc->flags = pcg_default_flags[ctype];
> + } else
> + SetPageCgroupRoot(pc);
>
> mem_cgroup_charge_statistics(mem, pc, true);
>
> @@ -1521,6 +1528,8 @@ __mem_cgroup_uncharge_common(struct page *page, enum charge_type ctype)
> mem_cgroup_charge_statistics(mem, pc, false);
>
> ClearPageCgroupUsed(pc);
> + if (mem == root_mem_cgroup)
> + ClearPageCgroupRoot(pc);
> /*
> * pc->mem_cgroup is not cleared here. It will be accessed when it's
> * freed from LRU. This is safe because uncharged page is expected not
> @@ -2504,6 +2513,7 @@ mem_cgroup_create(struct cgroup_subsys *ss, struct cgroup *cont)
> if (cont->parent == NULL) {
> enable_swap_cgroup();
> parent = NULL;
> + root_mem_cgroup = mem;
> } else {
> parent = mem_cgroup_from_cont(cont->parent);
> mem->use_hierarchy = parent->use_hierarchy;
> @@ -2532,6 +2542,7 @@ mem_cgroup_create(struct cgroup_subsys *ss, struct cgroup *cont)
> return &mem->css;
> free_out:
> __mem_cgroup_free(mem);
> + root_mem_cgroup = NULL;
> return ERR_PTR(error);
> }
>
> diff --git a/mm/page_cgroup.c b/mm/page_cgroup.c
> index 09b73c5..6145ff6 100644
> --- a/mm/page_cgroup.c
> +++ b/mm/page_cgroup.c
> @@ -276,7 +276,6 @@ void __meminit pgdat_page_cgroup_init(struct pglist_data *pgdat)
>
> #endif
>
> -
> #ifdef CONFIG_CGROUP_MEM_RES_CTLR_SWAP
>
> static DEFINE_MUTEX(swap_cgroup_mutex);
>
> --
> Thanks!
> Balbir
>
* Re: [RFC] Low overhead patches for the memory resource controller
2009-05-14 0:08 ` KAMEZAWA Hiroyuki
@ 2009-05-14 0:24 ` KAMEZAWA Hiroyuki
2009-05-14 2:39 ` Balbir Singh
2009-05-14 2:56 ` Balbir Singh
2009-05-14 2:57 ` Balbir Singh
1 sibling, 2 replies; 10+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-05-14 0:24 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: balbir, linux-mm, Andrew Morton, nishimura, lizf, KOSAKI Motohiro
On Thu, 14 May 2009 09:08:02 +0900
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:
> On Wed, 13 May 2009 21:02:18 +0530
> Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
>
> > Important: Not for inclusion, for discussion only
> >
> > I've been experimenting with a version of the patches below. They add
> > a PCGF_ROOT flag for tracking pages belonging to the root cgroup and
> > disable LRU manipulation for them
> >
> > Caveats:
> >
> > 1. I've not checked accounting, accounting might be broken
> > 2. I've not made the root cgroup as non limitable, we need to disable
> > hard limits once we agree to go with this
> >
> >
> > Tests
> >
> > Quick tests show an improvement with AIM9
> >
> >             mmotm+patch   mmotm-08-may-2009
> > AIM9            1338.57             1338.17
> > Dbase          18034.16            16021.58
> > New Dbase      18482.24            16518.54
> > Shared          9935.98             8882.11
> > Compute        16619.81            15226.13
> >
> > Comments on the approach much appreciated
> >
> > Feature: Remove the overhead associated with the root cgroup
> >
> > From: Balbir Singh <balbir@linux.vnet.ibm.com>
> >
> > This patch changes the memory cgroup and removes the overhead associated
> > with accounting all pages in the root cgroup. As a side-effect, we can
> > no longer set a memory hard limit in the root cgroup.
> >
> > A new flag is used to track page_cgroup associated with the root cgroup
> > pages.
>
> Hmm ? How about ignoring memcg completely when the thread belongs to ROOT
> cgroup rather than this halfway method ?
>
BTW, this will make softlimit much harder. Do you have any idea on softlimit after
this patch ?
Thanks,
-Kame
* Re: [RFC] Low overhead patches for the memory resource controller
2009-05-13 15:32 [RFC] Low overhead patches for the memory resource controller Balbir Singh
2009-05-14 0:08 ` KAMEZAWA Hiroyuki
@ 2009-05-14 0:42 ` KAMEZAWA Hiroyuki
2009-05-14 2:42 ` Balbir Singh
2009-05-14 1:35 ` KOSAKI Motohiro
2 siblings, 1 reply; 10+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-05-14 0:42 UTC (permalink / raw)
To: balbir; +Cc: linux-mm, Andrew Morton, nishimura, lizf, KOSAKI Motohiro
On Wed, 13 May 2009 21:02:18 +0530
Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
> Important: Not for inclusion, for discussion only
>
> I've been experimenting with a version of the patches below. They add
> a PCGF_ROOT flag for tracking pages belonging to the root cgroup and
> disable LRU manipulation for them
>
> Caveats:
>
> 1. I've not checked accounting, accounting might be broken
> 2. I've not made the root cgroup as non limitable, we need to disable
> hard limits once we agree to go with this
>
>
> Tests
>
> Quick tests show an improvement with AIM9
>
>             mmotm+patch   mmotm-08-may-2009
> AIM9            1338.57             1338.17
> Dbase          18034.16            16021.58
> New Dbase      18482.24            16518.54
> Shared          9935.98             8882.11
> Compute        16619.81            15226.13
>
> Comments on the approach much appreciated
>
> Feature: Remove the overhead associated with the root cgroup
>
> From: Balbir Singh <balbir@linux.vnet.ibm.com>
>
> This patch changes the memory cgroup and removes the overhead associated
> with accounting all pages in the root cgroup. As a side-effect, we can
> no longer set a memory hard limit in the root cgroup.
>
> A new flag is used to track page_cgroup associated with the root cgroup
> pages.
> ---
>
> include/linux/page_cgroup.h | 5 +++++
> mm/memcontrol.c | 23 +++++++++++++++++------
> mm/page_cgroup.c | 1 -
> 3 files changed, 22 insertions(+), 7 deletions(-)
>
>
> diff --git a/include/linux/page_cgroup.h b/include/linux/page_cgroup.h
> index 7339c7b..9c88e85 100644
> --- a/include/linux/page_cgroup.h
> +++ b/include/linux/page_cgroup.h
> @@ -26,6 +26,7 @@ enum {
> PCG_LOCK, /* page cgroup is locked */
> PCG_CACHE, /* charged as cache */
> PCG_USED, /* this object is in use. */
> + PCG_ROOT, /* page belongs to root cgroup */
> };
>
> #define TESTPCGFLAG(uname, lname) \
> @@ -46,6 +47,10 @@ TESTPCGFLAG(Cache, CACHE)
> TESTPCGFLAG(Used, USED)
> CLEARPCGFLAG(Used, USED)
>
> +SETPCGFLAG(Root, ROOT)
> +CLEARPCGFLAG(Root, ROOT)
> +TESTPCGFLAG(Root, ROOT)
> +
> static inline int page_cgroup_nid(struct page_cgroup *pc)
> {
> return page_to_nid(pc->page);
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 9712ef7..2750bed 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -43,6 +43,7 @@
>
> struct cgroup_subsys mem_cgroup_subsys __read_mostly;
> #define MEM_CGROUP_RECLAIM_RETRIES 5
> +struct mem_cgroup *root_mem_cgroup __read_mostly;
>
> #ifdef CONFIG_CGROUP_MEM_RES_CTLR_SWAP
> /* Turned on only when memory cgroup is enabled && really_do_swap_account = 0 */
> @@ -196,6 +197,7 @@ enum charge_type {
> #define PCGF_CACHE (1UL << PCG_CACHE)
> #define PCGF_USED (1UL << PCG_USED)
> #define PCGF_LOCK (1UL << PCG_LOCK)
> +#define PCGF_ROOT (1UL << PCG_ROOT)
> static const unsigned long
> pcg_default_flags[NR_CHARGE_TYPE] = {
> PCGF_CACHE | PCGF_USED | PCGF_LOCK, /* File Cache */
> @@ -422,6 +424,8 @@ void mem_cgroup_del_lru_list(struct page *page, enum lru_list lru)
> /* can happen while we handle swapcache. */
> if (list_empty(&pc->lru) || !pc->mem_cgroup)
> return;
> + if (PageCgroupRoot(pc))
> + return;
> /*
> * We don't check PCG_USED bit. It's cleared when the "page" is finally
> * removed from global LRU.
> @@ -452,8 +456,8 @@ void mem_cgroup_rotate_lru_list(struct page *page, enum lru_list lru)
> * For making pc->mem_cgroup visible, insert smp_rmb() here.
> */
> smp_rmb();
> - /* unused page is not rotated. */
> - if (!PageCgroupUsed(pc))
> + /* unused or root page is not rotated. */
> + if (!PageCgroupUsed(pc) || PageCgroupRoot(pc))
> return;
> mz = page_cgroup_zoneinfo(pc);
> list_move(&pc->lru, &mz->lists[lru]);
> @@ -472,7 +476,7 @@ void mem_cgroup_add_lru_list(struct page *page, enum lru_list lru)
> * For making pc->mem_cgroup visible, insert smp_rmb() here.
> */
> smp_rmb();
> - if (!PageCgroupUsed(pc))
> + if (!PageCgroupUsed(pc) || PageCgroupRoot(pc))
> return;
>
> mz = page_cgroup_zoneinfo(pc);
> @@ -1114,9 +1118,12 @@ static void __mem_cgroup_commit_charge(struct mem_cgroup *mem,
> css_put(&mem->css);
> return;
> }
> - pc->mem_cgroup = mem;
> - smp_wmb();
> - pc->flags = pcg_default_flags[ctype];
> + if (mem != root_mem_cgroup) {
> + pc->mem_cgroup = mem;
> + smp_wmb();
> + pc->flags = pcg_default_flags[ctype];
> + } else
> + SetPageCgroupRoot(pc);
>
This means:
 - PCG_USED is not set (so uncharge_common will be skipped completely).
 - The LOCK bit is dropped here.

After this is fixed, the test results will change.
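The PCG_USED point can be illustrated with a small userspace model. This is a sketch, not the kernel code: struct pc_model and both functions are hypothetical, the default flags are taken from the File Cache entry of pcg_default_flags[], and the early exit in uncharge_model() assumes that the real uncharge path bails out when PCG_USED is clear. The LOCK bit is not modelled here.

```c
#include <stdbool.h>

enum { PCG_LOCK, PCG_CACHE, PCG_USED, PCG_ROOT };

#define PCGF_LOCK  (1UL << PCG_LOCK)
#define PCGF_CACHE (1UL << PCG_CACHE)
#define PCGF_USED  (1UL << PCG_USED)
#define PCGF_ROOT  (1UL << PCG_ROOT)

/* Simplified stand-in for struct page_cgroup: only the flags word. */
struct pc_model {
	unsigned long flags;
};

/* Models the commit path as posted: the root branch sets only PCG_ROOT. */
static void commit_charge_as_posted(struct pc_model *pc, bool is_root)
{
	if (!is_root)
		pc->flags = PCGF_CACHE | PCGF_USED | PCGF_LOCK;
	else
		pc->flags |= PCGF_ROOT;
}

/*
 * Models the uncharge path under the assumption stated above.
 * Returns true if the page was actually uncharged. Clearing both bits
 * unconditionally is a simplification of the patched code.
 */
static bool uncharge_model(struct pc_model *pc)
{
	if (!(pc->flags & PCGF_USED))
		return false;	/* skipped: statistics are never decremented */
	pc->flags &= ~(PCGF_USED | PCGF_ROOT);
	return true;
}
```

In this model a root page is charged in the statistics at commit time but never uncharged, so the counters can only grow.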
Thanks,
-Kame
> mem_cgroup_charge_statistics(mem, pc, true);
>
> @@ -1521,6 +1528,8 @@ __mem_cgroup_uncharge_common(struct page *page, enum charge_type ctype)
> mem_cgroup_charge_statistics(mem, pc, false);
>
> ClearPageCgroupUsed(pc);
> + if (mem == root_mem_cgroup)
> + ClearPageCgroupRoot(pc);
> /*
> * pc->mem_cgroup is not cleared here. It will be accessed when it's
> * freed from LRU. This is safe because uncharged page is expected not
> @@ -2504,6 +2513,7 @@ mem_cgroup_create(struct cgroup_subsys *ss, struct cgroup *cont)
> if (cont->parent == NULL) {
> enable_swap_cgroup();
> parent = NULL;
> + root_mem_cgroup = mem;
> } else {
> parent = mem_cgroup_from_cont(cont->parent);
> mem->use_hierarchy = parent->use_hierarchy;
> @@ -2532,6 +2542,7 @@ mem_cgroup_create(struct cgroup_subsys *ss, struct cgroup *cont)
> return &mem->css;
> free_out:
> __mem_cgroup_free(mem);
> + root_mem_cgroup = NULL;
> return ERR_PTR(error);
> }
>
> diff --git a/mm/page_cgroup.c b/mm/page_cgroup.c
> index 09b73c5..6145ff6 100644
> --- a/mm/page_cgroup.c
> +++ b/mm/page_cgroup.c
> @@ -276,7 +276,6 @@ void __meminit pgdat_page_cgroup_init(struct pglist_data *pgdat)
>
> #endif
>
> -
> #ifdef CONFIG_CGROUP_MEM_RES_CTLR_SWAP
>
> static DEFINE_MUTEX(swap_cgroup_mutex);
>
> --
> Thanks!
> Balbir
>
* Re: [RFC] Low overhead patches for the memory resource controller
2009-05-13 15:32 [RFC] Low overhead patches for the memory resource controller Balbir Singh
2009-05-14 0:08 ` KAMEZAWA Hiroyuki
2009-05-14 0:42 ` KAMEZAWA Hiroyuki
@ 2009-05-14 1:35 ` KOSAKI Motohiro
2 siblings, 0 replies; 10+ messages in thread
From: KOSAKI Motohiro @ 2009-05-14 1:35 UTC (permalink / raw)
To: balbir
Cc: kosaki.motohiro, linux-mm, Andrew Morton, KAMEZAWA Hiroyuki,
nishimura, lizf
> Important: Not for inclusion, for discussion only
>
> I've been experimenting with a version of the patches below. They add
> a PCGF_ROOT flag for tracking pages belonging to the root cgroup and
> disable LRU manipulation for them
>
> Caveats:
>
> 1. I've not checked accounting, accounting might be broken
> 2. I've not made the root cgroup as non limitable, we need to disable
> hard limits once we agree to go with this
>
>
> Tests
>
> Quick tests show an improvement with AIM9
>
>             mmotm+patch   mmotm-08-may-2009
> AIM9            1338.57             1338.17
> Dbase          18034.16            16021.58
> New Dbase      18482.24            16518.54
> Shared          9935.98             8882.11
> Compute        16619.81            15226.13
>
> Comments on the approach much appreciated
>
> Feature: Remove the overhead associated with the root cgroup
>
> From: Balbir Singh <balbir@linux.vnet.ibm.com>
>
> This patch changes the memory cgroup and removes the overhead associated
> with accounting all pages in the root cgroup. As a side-effect, we can
> no longer set a memory hard limit in the root cgroup.
>
> A new flag is used to track page_cgroup associated with the root cgroup
> pages.
I think this is the right direction. Typical desktop users use neither
a non-root cgroup nor the boot parameter that disables cgroups, so this
patch improves their experience.
I hope you fix the remaining technical issues.
Thanks.
* Re: [RFC] Low overhead patches for the memory resource controller
2009-05-14 0:24 ` KAMEZAWA Hiroyuki
@ 2009-05-14 2:39 ` Balbir Singh
2009-05-14 3:00 ` KAMEZAWA Hiroyuki
2009-05-14 2:56 ` Balbir Singh
1 sibling, 1 reply; 10+ messages in thread
From: Balbir Singh @ 2009-05-14 2:39 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: linux-mm, Andrew Morton, nishimura, lizf, KOSAKI Motohiro
* KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> [2009-05-14 09:24:05]:
> On Thu, 14 May 2009 09:08:02 +0900
> KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:
>
> > On Wed, 13 May 2009 21:02:18 +0530
> > Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
> >
> > > Important: Not for inclusion, for discussion only
> > >
> > > I've been experimenting with a version of the patches below. They add
> > > a PCGF_ROOT flag for tracking pages belonging to the root cgroup and
> > > disable LRU manipulation for them
> > >
> > > Caveats:
> > >
> > > 1. I've not checked accounting, accounting might be broken
> > > 2. I've not made the root cgroup as non limitable, we need to disable
> > > hard limits once we agree to go with this
> > >
> > >
> > > Tests
> > >
> > > Quick tests show an improvement with AIM9
> > >
> > >             mmotm+patch   mmotm-08-may-2009
> > > AIM9            1338.57             1338.17
> > > Dbase          18034.16            16021.58
> > > New Dbase      18482.24            16518.54
> > > Shared          9935.98             8882.11
> > > Compute        16619.81            15226.13
> > >
> > > Comments on the approach much appreciated
> > >
> > > Feature: Remove the overhead associated with the root cgroup
> > >
> > > From: Balbir Singh <balbir@linux.vnet.ibm.com>
> > >
> > > This patch changes the memory cgroup and removes the overhead associated
> > > with accounting all pages in the root cgroup. As a side-effect, we can
> > > no longer set a memory hard limit in the root cgroup.
> > >
> > > A new flag is used to track page_cgroup associated with the root cgroup
> > > pages.
> >
> > Hmm ? How about ignoring memcg completely when the thread belongs to ROOT
> > cgroup rather than this halfway method ?
> >
> BTW, this will make softlimit much harder. Do you have any idea on softlimit after
> this patch ?
>
Why would this make soft limits much harder? Since we charge up
hierarchically, even now we ignore a cgroup if its soft limit is not
set. I am not sure I understand why.
--
Balbir
* Re: [RFC] Low overhead patches for the memory resource controller
2009-05-14 0:42 ` KAMEZAWA Hiroyuki
@ 2009-05-14 2:42 ` Balbir Singh
0 siblings, 0 replies; 10+ messages in thread
From: Balbir Singh @ 2009-05-14 2:42 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: linux-mm, Andrew Morton, nishimura, lizf, KOSAKI Motohiro
* KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> [2009-05-14 09:42:23]:
> On Wed, 13 May 2009 21:02:18 +0530
> Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
>
> > Important: Not for inclusion, for discussion only
> >
> > I've been experimenting with a version of the patches below. They add
> > a PCGF_ROOT flag for tracking pages belonging to the root cgroup and
> > disable LRU manipulation for them
> >
> > Caveats:
> >
> > 1. I've not checked accounting, accounting might be broken
> > 2. I've not made the root cgroup as non limitable, we need to disable
> > hard limits once we agree to go with this
> >
> >
> > Tests
> >
> > Quick tests show an improvement with AIM9
> >
> >             mmotm+patch   mmotm-08-may-2009
> > AIM9            1338.57             1338.17
> > Dbase          18034.16            16021.58
> > New Dbase      18482.24            16518.54
> > Shared          9935.98             8882.11
> > Compute        16619.81            15226.13
> >
> > Comments on the approach much appreciated
> >
> > Feature: Remove the overhead associated with the root cgroup
> >
> > From: Balbir Singh <balbir@linux.vnet.ibm.com>
> >
> > This patch changes the memory cgroup and removes the overhead associated
> > with accounting all pages in the root cgroup. As a side-effect, we can
> > no longer set a memory hard limit in the root cgroup.
> >
> > A new flag is used to track page_cgroup associated with the root cgroup
> > pages.
> > ---
> >
> > include/linux/page_cgroup.h | 5 +++++
> > mm/memcontrol.c | 23 +++++++++++++++++------
> > mm/page_cgroup.c | 1 -
> > 3 files changed, 22 insertions(+), 7 deletions(-)
> >
> >
> > diff --git a/include/linux/page_cgroup.h b/include/linux/page_cgroup.h
> > index 7339c7b..9c88e85 100644
> > --- a/include/linux/page_cgroup.h
> > +++ b/include/linux/page_cgroup.h
> > @@ -26,6 +26,7 @@ enum {
> > PCG_LOCK, /* page cgroup is locked */
> > PCG_CACHE, /* charged as cache */
> > PCG_USED, /* this object is in use. */
> > + PCG_ROOT, /* page belongs to root cgroup */
> > };
> >
> > #define TESTPCGFLAG(uname, lname) \
> > @@ -46,6 +47,10 @@ TESTPCGFLAG(Cache, CACHE)
> > TESTPCGFLAG(Used, USED)
> > CLEARPCGFLAG(Used, USED)
> >
> > +SETPCGFLAG(Root, ROOT)
> > +CLEARPCGFLAG(Root, ROOT)
> > +TESTPCGFLAG(Root, ROOT)
> > +
> > static inline int page_cgroup_nid(struct page_cgroup *pc)
> > {
> > return page_to_nid(pc->page);
> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > index 9712ef7..2750bed 100644
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -43,6 +43,7 @@
> >
> > struct cgroup_subsys mem_cgroup_subsys __read_mostly;
> > #define MEM_CGROUP_RECLAIM_RETRIES 5
> > +struct mem_cgroup *root_mem_cgroup __read_mostly;
> >
> > #ifdef CONFIG_CGROUP_MEM_RES_CTLR_SWAP
> > /* Turned on only when memory cgroup is enabled && really_do_swap_account = 0 */
> > @@ -196,6 +197,7 @@ enum charge_type {
> > #define PCGF_CACHE (1UL << PCG_CACHE)
> > #define PCGF_USED (1UL << PCG_USED)
> > #define PCGF_LOCK (1UL << PCG_LOCK)
> > +#define PCGF_ROOT (1UL << PCG_ROOT)
> > static const unsigned long
> > pcg_default_flags[NR_CHARGE_TYPE] = {
> > PCGF_CACHE | PCGF_USED | PCGF_LOCK, /* File Cache */
> > @@ -422,6 +424,8 @@ void mem_cgroup_del_lru_list(struct page *page, enum lru_list lru)
> > /* can happen while we handle swapcache. */
> > if (list_empty(&pc->lru) || !pc->mem_cgroup)
> > return;
> > + if (PageCgroupRoot(pc))
> > + return;
> > /*
> > * We don't check PCG_USED bit. It's cleared when the "page" is finally
> > * removed from global LRU.
> > @@ -452,8 +456,8 @@ void mem_cgroup_rotate_lru_list(struct page *page, enum lru_list lru)
> > * For making pc->mem_cgroup visible, insert smp_rmb() here.
> > */
> > smp_rmb();
> > - /* unused page is not rotated. */
> > - if (!PageCgroupUsed(pc))
> > + /* unused or root page is not rotated. */
> > + if (!PageCgroupUsed(pc) || PageCgroupRoot(pc))
> > return;
> > mz = page_cgroup_zoneinfo(pc);
> > list_move(&pc->lru, &mz->lists[lru]);
> > @@ -472,7 +476,7 @@ void mem_cgroup_add_lru_list(struct page *page, enum lru_list lru)
> > * For making pc->mem_cgroup visible, insert smp_rmb() here.
> > */
> > smp_rmb();
> > - if (!PageCgroupUsed(pc))
> > + if (!PageCgroupUsed(pc) || PageCgroupRoot(pc))
> > return;
> >
> > mz = page_cgroup_zoneinfo(pc);
> > @@ -1114,9 +1118,12 @@ static void __mem_cgroup_commit_charge(struct mem_cgroup *mem,
> > css_put(&mem->css);
> > return;
> > }
> > - pc->mem_cgroup = mem;
> > - smp_wmb();
> > - pc->flags = pcg_default_flags[ctype];
> > + if (mem != root_mem_cgroup) {
> > + pc->mem_cgroup = mem;
> > + smp_wmb();
> > + pc->flags = pcg_default_flags[ctype];
> > + } else
> > + SetPageCgroupRoot(pc);
> >
> This means:
>  - PCG_USED is not set (so uncharge_common will be skipped completely).
>  - The LOCK bit is dropped here.
>
> After this is fixed, the test results will change.
>
Yep, I've not checked the impact on accounting. I think I need a check
for the !Used && Root case to make sure the accounting is not broken.
I'll test again.
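One possible shape for such a check, as a userspace sketch only. Both helpers are hypothetical and do not exist in the kernel; only the bit names come from the patch. The assumption is that the uncharge path should go ahead for root-tagged pages even though PCG_USED was never set on them.

```c
#include <stdbool.h>

enum { PCG_LOCK, PCG_CACHE, PCG_USED, PCG_ROOT };

#define PCGF_USED (1UL << PCG_USED)
#define PCGF_ROOT (1UL << PCG_ROOT)

/* Hypothetical: true for the state the posted patch leaves root pages in. */
static bool root_without_used(unsigned long flags)
{
	return !(flags & PCGF_USED) && (flags & PCGF_ROOT);
}

/*
 * Hypothetical uncharge guard: proceed for normally charged pages and
 * for root-tagged ones, and skip pages that were never charged at all.
 */
static bool uncharge_should_proceed(unsigned long flags)
{
	return (flags & PCGF_USED) || root_without_used(flags);
}
```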
--
Balbir
* Re: [RFC] Low overhead patches for the memory resource controller
2009-05-14 0:24 ` KAMEZAWA Hiroyuki
2009-05-14 2:39 ` Balbir Singh
@ 2009-05-14 2:56 ` Balbir Singh
1 sibling, 0 replies; 10+ messages in thread
From: Balbir Singh @ 2009-05-14 2:56 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: linux-mm, Andrew Morton, nishimura, lizf, KOSAKI Motohiro
* KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> [2009-05-14 09:24:05]:
> On Thu, 14 May 2009 09:08:02 +0900
> KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:
>
> > On Wed, 13 May 2009 21:02:18 +0530
> > Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
> >
> > > Important: Not for inclusion, for discussion only
> > >
> > > I've been experimenting with a version of the patches below. They add
> > > a PCGF_ROOT flag for tracking pages belonging to the root cgroup and
> > > disable LRU manipulation for them
> > >
> > > Caveats:
> > >
> > > 1. I've not checked accounting, accounting might be broken
> > > 2. I've not made the root cgroup as non limitable, we need to disable
> > > hard limits once we agree to go with this
> > >
> > >
> > > Tests
> > >
> > > Quick tests show an improvement with AIM9
> > >
> > > mmotm+patch mmtom-08-may-2009
> > > AIM9 1338.57 1338.17
> > > Dbase 18034.16 16021.58
> > > New Dbase 18482.24 16518.54
> > > Shared 9935.98 8882.11
> > > Compute 16619.81 15226.13
> > >
> > > Comments on the approach much appreciated
> > >
> > > Feature: Remove the overhead associated with the root cgroup
> > >
> > > From: Balbir Singh <balbir@linux.vnet.ibm.com>
> > >
> > > This patch changes the memory cgroup and removes the overhead associated
> > > with accounting all pages in the root cgroup. As a side-effect, we can
> > > no longer set a memory hard limit in the root cgroup.
> > >
> > > A new flag is used to track page_cgroup associated with the root cgroup
> > > pages.
> >
> > Hmm ? How about ignoring memcg completely when the thread belongs to ROOT
> > cgroup rather than this halfway method ?
> >
> BTW, this will make softlimit much harder. Do you have any idea on softlimit after
> this patch ?
>
Quick clarification: will my patches make soft limit harder, or will the
suggestion above do that?
--
Balbir
* Re: [RFC] Low overhead patches for the memory resource controller
2009-05-14 0:08 ` KAMEZAWA Hiroyuki
2009-05-14 0:24 ` KAMEZAWA Hiroyuki
@ 2009-05-14 2:57 ` Balbir Singh
1 sibling, 0 replies; 10+ messages in thread
From: Balbir Singh @ 2009-05-14 2:57 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: linux-mm, Andrew Morton, nishimura, lizf, KOSAKI Motohiro
* KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> [2009-05-14 09:08:02]:
> On Wed, 13 May 2009 21:02:18 +0530
> Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
>
> > Important: Not for inclusion, for discussion only
> >
> > I've been experimenting with a version of the patches below. They add
> > a PCGF_ROOT flag for tracking pages belonging to the root cgroup and
> > disable LRU manipulation for them
> >
> > Caveats:
> >
> > 1. I've not checked accounting, accounting might be broken
> > 2. I've not made the root cgroup as non limitable, we need to disable
> > hard limits once we agree to go with this
> >
> >
> > Tests
> >
> > Quick tests show an improvement with AIM9
> >
> > mmotm+patch mmtom-08-may-2009
> > AIM9 1338.57 1338.17
> > Dbase 18034.16 16021.58
> > New Dbase 18482.24 16518.54
> > Shared 9935.98 8882.11
> > Compute 16619.81 15226.13
> >
> > Comments on the approach much appreciated
> >
> > Feature: Remove the overhead associated with the root cgroup
> >
> > From: Balbir Singh <balbir@linux.vnet.ibm.com>
> >
> > This patch changes the memory cgroup and removes the overhead associated
> > with accounting all pages in the root cgroup. As a side-effect, we can
> > no longer set a memory hard limit in the root cgroup.
> >
> > A new flag is used to track page_cgroup associated with the root cgroup
> > pages.
>
> Hmm ? How about ignoring memcg completely when the thread belongs to ROOT
> cgroup rather than this halfway method ?
>
I wanted to keep root cgroup accounting. It is especially useful in a
hierarchical setup, and even otherwise we don't want those values
to disappear. Maybe in the longer run we could decide to move,
provided we get sufficient time to deprecate root cgroup stats.
--
Balbir
* Re: [RFC] Low overhead patches for the memory resource controller
2009-05-14 2:39 ` Balbir Singh
@ 2009-05-14 3:00 ` KAMEZAWA Hiroyuki
0 siblings, 0 replies; 10+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-05-14 3:00 UTC (permalink / raw)
To: balbir; +Cc: linux-mm, Andrew Morton, nishimura, lizf, KOSAKI Motohiro
On Thu, 14 May 2009 08:09:30 +0530
Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
> * KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> [2009-05-14 09:24:05]:
>
> > On Thu, 14 May 2009 09:08:02 +0900
> > KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:
> >
> > > On Wed, 13 May 2009 21:02:18 +0530
> > > Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
> > >
> > > > Important: Not for inclusion, for discussion only
> > > >
> > > > I've been experimenting with a version of the patches below. They add
> > > > a PCGF_ROOT flag for tracking pages belonging to the root cgroup and
> > > > disable LRU manipulation for them
> > > >
> > > > Caveats:
> > > >
> > > > 1. I've not checked accounting, accounting might be broken
> > > > 2. I've not made the root cgroup as non limitable, we need to disable
> > > > hard limits once we agree to go with this
> > > >
> > > >
> > > > Tests
> > > >
> > > > Quick tests show an improvement with AIM9
> > > >
> > > > mmotm+patch mmtom-08-may-2009
> > > > AIM9 1338.57 1338.17
> > > > Dbase 18034.16 16021.58
> > > > New Dbase 18482.24 16518.54
> > > > Shared 9935.98 8882.11
> > > > Compute 16619.81 15226.13
> > > >
> > > > Comments on the approach much appreciated
> > > >
> > > > Feature: Remove the overhead associated with the root cgroup
> > > >
> > > > From: Balbir Singh <balbir@linux.vnet.ibm.com>
> > > >
> > > > This patch changes the memory cgroup and removes the overhead associated
> > > > with accounting all pages in the root cgroup. As a side-effect, we can
> > > > no longer set a memory hard limit in the root cgroup.
> > > >
> > > > A new flag is used to track page_cgroup associated with the root cgroup
> > > > pages.
> > >
> > > Hmm ? How about ignoring memcg completely when the thread belongs to ROOT
> > > cgroup rather than this halfway method ?
> > >
> > BTW, this will make softlimit much harder. Do you have any idea on softlimit after
> > this patch ?
> >
>
> Why would this make soft limit much harder? Since we charge up
> hierarchically even now, we ignore a cgroup if its soft limit is not
> set. I am not sure I understand why.
>
I suspect I misunderstood something.
Anyway, disabling soft limit for the root cgroup is necessary because we have no LRU.
Sorry for the noise.
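
A rough user-space sketch of the selection rule discussed above: a group
with no soft limit set is ignored, and the root group is never a candidate
because its pages are not on a per-cgroup LRU. The struct, the sentinel and
the function names are hypothetical, not the kernel's soft limit code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical sentinel meaning "no soft limit configured". */
#define SOFT_LIMIT_UNSET ((unsigned long)-1)

/* Hypothetical model of a memory cgroup, reduced to what the rule needs. */
struct memcg_model {
	const char *name;
	bool is_root;
	unsigned long usage;      /* pages currently charged */
	unsigned long soft_limit; /* SOFT_LIMIT_UNSET if not configured */
};

/* A group is a soft-limit reclaim candidate only if it is over its limit. */
static bool soft_limit_candidate(const struct memcg_model *cg)
{
	if (cg->is_root)
		return false; /* no LRU to reclaim from */
	if (cg->soft_limit == SOFT_LIMIT_UNSET)
		return false; /* soft limit not set: ignore the group */
	return cg->usage > cg->soft_limit;
}

/* Pick the candidate with the largest excess, or NULL if there is none. */
static const struct memcg_model *
pick_soft_limit_victim(const struct memcg_model *groups, size_t n)
{
	const struct memcg_model *victim = NULL;
	unsigned long worst = 0;

	for (size_t i = 0; i < n; i++) {
		if (!soft_limit_candidate(&groups[i]))
			continue;
		unsigned long excess = groups[i].usage - groups[i].soft_limit;
		if (excess > worst) {
			worst = excess;
			victim = &groups[i];
		}
	}
	return victim;
}
```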
-Kame
Thread overview: 10+ messages
2009-05-13 15:32 [RFC] Low overhead patches for the memory resource controller Balbir Singh
2009-05-14 0:08 ` KAMEZAWA Hiroyuki
2009-05-14 0:24 ` KAMEZAWA Hiroyuki
2009-05-14 2:39 ` Balbir Singh
2009-05-14 3:00 ` KAMEZAWA Hiroyuki
2009-05-14 2:56 ` Balbir Singh
2009-05-14 2:57 ` Balbir Singh
2009-05-14 0:42 ` KAMEZAWA Hiroyuki
2009-05-14 2:42 ` Balbir Singh
2009-05-14 1:35 ` KOSAKI Motohiro