* [PATCH 0/2] memcg: some fixes for -rc
@ 2009-04-21  5:26 Daisuke Nishimura
  2009-04-21  5:29 ` [PATCH 1/2] memcg: fix shrink_usage Daisuke Nishimura
  2009-04-21  5:29 ` [PATCH 2/2] memcg: free unused swapcache at the end of page migration Daisuke Nishimura
  0 siblings, 2 replies; 5+ messages in thread

From: Daisuke Nishimura @ 2009-04-21 5:26 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, KAMEZAWA Hiroyuki, Balbir Singh, Daisuke Nishimura

These are bugfix patches for memcg I've been testing for a few weeks.
They are based on -rc1, and can be applied to mmotm(2009-04-17-15-19) too.

[1/2] memcg: fix shrink_usage
[2/2] memcg: free unused swapcache at the end of page migration

Thanks,
Daisuke Nishimura.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
* [PATCH 1/2] memcg: fix shrink_usage
  2009-04-21  5:26 [PATCH 0/2] memcg: some fixes for -rc Daisuke Nishimura
@ 2009-04-21  5:29 ` Daisuke Nishimura
  2009-04-21  6:54   ` KAMEZAWA Hiroyuki
  2009-04-21  5:29 ` [PATCH 2/2] memcg: free unused swapcache at the end of page migration Daisuke Nishimura
  1 sibling, 1 reply; 5+ messages in thread

From: Daisuke Nishimura @ 2009-04-21 5:29 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, KAMEZAWA Hiroyuki, Balbir Singh, Daisuke Nishimura

Current mem_cgroup_shrink_usage has two problems.

1. It doesn't call mem_cgroup_out_of_memory and doesn't update
   last_oom_jiffies, so pagefault_out_of_memory invokes global OOM.
2. Considering hierarchy, shrinking has to be done from the mem_over_limit,
   not from the memcg which the page would be charged to.

mem_cgroup_try_charge_swapin does all of this work properly, so we use it
and call cancel_charge_swapin when it succeeds.

The name "shrink_usage" is not appropriate for this behavior, so we change
it too.

Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
 include/linux/memcontrol.h |    4 ++--
 mm/memcontrol.c            |   33 ++++++++++++---------------------
 mm/shmem.c                 |    8 ++++++--
 3 files changed, 20 insertions(+), 25 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 18146c9..928b714 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -56,7 +56,7 @@ extern void mem_cgroup_move_lists(struct page *page,
 					enum lru_list from, enum lru_list to);
 extern void mem_cgroup_uncharge_page(struct page *page);
 extern void mem_cgroup_uncharge_cache_page(struct page *page);
-extern int mem_cgroup_shrink_usage(struct page *page,
+extern int mem_cgroup_shmem_charge_fallback(struct page *page,
 			struct mm_struct *mm, gfp_t gfp_mask);
 
 extern unsigned long mem_cgroup_isolate_pages(unsigned long nr_to_scan,
@@ -155,7 +155,7 @@ static inline void mem_cgroup_uncharge_cache_page(struct page *page)
 {
 }
 
-static inline int mem_cgroup_shrink_usage(struct page *page,
+static inline int mem_cgroup_shmem_charge_fallback(struct page *page,
 			struct mm_struct *mm, gfp_t gfp_mask)
 {
 	return 0;
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 2fc6d6c..619b0c1 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1618,37 +1618,28 @@ void mem_cgroup_end_migration(struct mem_cgroup *mem,
 }
 
 /*
- * A call to try to shrink memory usage under specified resource controller.
- * This is typically used for page reclaiming for shmem for reducing side
- * effect of page allocation from shmem, which is used by some mem_cgroup.
+ * A call to try to shrink memory usage on charge failure at shmem's swapin.
+ * Calling hierarchical_reclaim is not enough because we should update
+ * last_oom_jiffies to prevent pagefault_out_of_memory from invoking global OOM.
+ * Moreover considering hierarchy, we should reclaim from the mem_over_limit,
+ * not from the memcg which this page would be charged to.
+ * try_charge_swapin does all of these works properly.
  */
-int mem_cgroup_shrink_usage(struct page *page,
+int mem_cgroup_shmem_charge_fallback(struct page *page,
 			struct mm_struct *mm,
 			gfp_t gfp_mask)
 {
 	struct mem_cgroup *mem = NULL;
-	int progress = 0;
-	int retry = MEM_CGROUP_RECLAIM_RETRIES;
+	int ret;
 
 	if (mem_cgroup_disabled())
 		return 0;
-	if (page)
-		mem = try_get_mem_cgroup_from_swapcache(page);
-	if (!mem && mm)
-		mem = try_get_mem_cgroup_from_mm(mm);
-	if (unlikely(!mem))
-		return 0;
 
-	do {
-		progress = mem_cgroup_hierarchical_reclaim(mem,
-					gfp_mask, true, false);
-		progress += mem_cgroup_check_under_limit(mem);
-	} while (!progress && --retry);
+	ret = mem_cgroup_try_charge_swapin(mm, page, gfp_mask, &mem);
+	if (!ret)
+		mem_cgroup_cancel_charge_swapin(mem); /* it does !mem check */
 
-	css_put(&mem->css);
-	if (!retry)
-		return -ENOMEM;
-	return 0;
+	return ret;
 }
 
 static DEFINE_MUTEX(set_limit_mutex);
diff --git a/mm/shmem.c b/mm/shmem.c
index d94d2e9..2419562 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1325,8 +1325,12 @@ repeat:
 			shmem_swp_unmap(entry);
 			spin_unlock(&info->lock);
 			if (error == -ENOMEM) {
-				/* allow reclaim from this memory cgroup */
-				error = mem_cgroup_shrink_usage(swappage,
+				/*
+				 * reclaim from proper memory cgroup and
+				 * call memcg's OOM if needed.
+				 */
+				error = mem_cgroup_shmem_charge_fallback(
+								swappage,
 								current->mm,
 								gfp);
 				if (error) {
* Re: [PATCH 1/2] memcg: fix shrink_usage
  2009-04-21  5:29 ` [PATCH 1/2] memcg: fix shrink_usage Daisuke Nishimura
@ 2009-04-21  6:54   ` KAMEZAWA Hiroyuki
  0 siblings, 0 replies; 5+ messages in thread

From: KAMEZAWA Hiroyuki @ 2009-04-21 6:54 UTC (permalink / raw)
To: Daisuke Nishimura; +Cc: Andrew Morton, linux-mm, Balbir Singh

On Tue, 21 Apr 2009 14:29:18 +0900
Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> wrote:

> Current mem_cgroup_shrink_usage has two problems.
> 
> 1. It doesn't call mem_cgroup_out_of_memory and doesn't update
>    last_oom_jiffies, so pagefault_out_of_memory invokes global OOM.
> 2. Considering hierarchy, shrinking has to be done from the mem_over_limit,
>    not from the memcg which the page would be charged to.
> 
> mem_cgroup_try_charge_swapin does all of this work properly, so we use it
> and call cancel_charge_swapin when it succeeds.
> 
> The name "shrink_usage" is not appropriate for this behavior, so we change
> it too.
> 
> Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>

Thank you!
-Kame

> ---
>  include/linux/memcontrol.h |    4 ++--
>  mm/memcontrol.c            |   33 ++++++++++++---------------------
>  mm/shmem.c                 |    8 ++++++--
>  3 files changed, 20 insertions(+), 25 deletions(-)
> 
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index 18146c9..928b714 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -56,7 +56,7 @@ extern void mem_cgroup_move_lists(struct page *page,
>  					enum lru_list from, enum lru_list to);
>  extern void mem_cgroup_uncharge_page(struct page *page);
>  extern void mem_cgroup_uncharge_cache_page(struct page *page);
> -extern int mem_cgroup_shrink_usage(struct page *page,
> +extern int mem_cgroup_shmem_charge_fallback(struct page *page,
>  			struct mm_struct *mm, gfp_t gfp_mask);
> 
>  extern unsigned long mem_cgroup_isolate_pages(unsigned long nr_to_scan,
> @@ -155,7 +155,7 @@ static inline void mem_cgroup_uncharge_cache_page(struct page *page)
>  {
>  }
> 
> -static inline int mem_cgroup_shrink_usage(struct page *page,
> +static inline int mem_cgroup_shmem_charge_fallback(struct page *page,
>  			struct mm_struct *mm, gfp_t gfp_mask)
>  {
>  	return 0;
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 2fc6d6c..619b0c1 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -1618,37 +1618,28 @@ void mem_cgroup_end_migration(struct mem_cgroup *mem,
>  }
> 
>  /*
> - * A call to try to shrink memory usage under specified resource controller.
> - * This is typically used for page reclaiming for shmem for reducing side
> - * effect of page allocation from shmem, which is used by some mem_cgroup.
> + * A call to try to shrink memory usage on charge failure at shmem's swapin.
> + * Calling hierarchical_reclaim is not enough because we should update
> + * last_oom_jiffies to prevent pagefault_out_of_memory from invoking global OOM.
> + * Moreover considering hierarchy, we should reclaim from the mem_over_limit,
> + * not from the memcg which this page would be charged to.
> + * try_charge_swapin does all of these works properly.
>   */
> -int mem_cgroup_shrink_usage(struct page *page,
> +int mem_cgroup_shmem_charge_fallback(struct page *page,
>  			struct mm_struct *mm,
>  			gfp_t gfp_mask)
>  {
>  	struct mem_cgroup *mem = NULL;
> -	int progress = 0;
> -	int retry = MEM_CGROUP_RECLAIM_RETRIES;
> +	int ret;
> 
>  	if (mem_cgroup_disabled())
>  		return 0;
> -	if (page)
> -		mem = try_get_mem_cgroup_from_swapcache(page);
> -	if (!mem && mm)
> -		mem = try_get_mem_cgroup_from_mm(mm);
> -	if (unlikely(!mem))
> -		return 0;
> 
> -	do {
> -		progress = mem_cgroup_hierarchical_reclaim(mem,
> -					gfp_mask, true, false);
> -		progress += mem_cgroup_check_under_limit(mem);
> -	} while (!progress && --retry);
> +	ret = mem_cgroup_try_charge_swapin(mm, page, gfp_mask, &mem);
> +	if (!ret)
> +		mem_cgroup_cancel_charge_swapin(mem); /* it does !mem check */
> 
> -	css_put(&mem->css);
> -	if (!retry)
> -		return -ENOMEM;
> -	return 0;
> +	return ret;
>  }
> 
>  static DEFINE_MUTEX(set_limit_mutex);
> diff --git a/mm/shmem.c b/mm/shmem.c
> index d94d2e9..2419562 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -1325,8 +1325,12 @@ repeat:
>  			shmem_swp_unmap(entry);
>  			spin_unlock(&info->lock);
>  			if (error == -ENOMEM) {
> -				/* allow reclaim from this memory cgroup */
> -				error = mem_cgroup_shrink_usage(swappage,
> +				/*
> +				 * reclaim from proper memory cgroup and
> +				 * call memcg's OOM if needed.
> +				 */
> +				error = mem_cgroup_shmem_charge_fallback(
> +								swappage,
>  								current->mm,
>  								gfp);
>  				if (error) {
* [PATCH 2/2] memcg: free unused swapcache at the end of page migration
  2009-04-21  5:26 [PATCH 0/2] memcg: some fixes for -rc Daisuke Nishimura
  2009-04-21  5:29 ` [PATCH 1/2] memcg: fix shrink_usage Daisuke Nishimura
@ 2009-04-21  5:29 ` Daisuke Nishimura
  2009-04-21  6:53   ` KAMEZAWA Hiroyuki
  1 sibling, 1 reply; 5+ messages in thread

From: Daisuke Nishimura @ 2009-04-21 5:29 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, KAMEZAWA Hiroyuki, Balbir Singh, Daisuke Nishimura

Reading the comments, mem_cgroup_end_migration assumes that "newpage" is
under lock_page.

And at the end of mem_cgroup_end_migration, mem_cgroup_uncharge_page cannot
uncharge the "target" if it's SwapCache, even if the owner process has
already called zap_pte_range -> free_swap_and_cache.
try_to_free_swap does all necessary checks (it checks page_swapcount).

Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
---
 mm/memcontrol.c |    7 +++++--
 mm/migrate.c    |    9 +++++++--
 2 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 619b0c1..f41433c 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1611,10 +1611,13 @@ void mem_cgroup_end_migration(struct mem_cgroup *mem,
 	 * There is a case for !page_mapped(). At the start of
 	 * migration, oldpage was mapped. But now, it's zapped.
 	 * But we know *target* page is not freed/reused under us.
-	 * mem_cgroup_uncharge_page() does all necessary checks.
+	 * mem_cgroup_uncharge_page() cannot free SwapCache, so we call
+	 * try_to_free_swap(), which does all necessary checks.
 	 */
-	if (ctype == MEM_CGROUP_CHARGE_TYPE_MAPPED)
+	if (ctype == MEM_CGROUP_CHARGE_TYPE_MAPPED && !page_mapped(target)) {
 		mem_cgroup_uncharge_page(target);
+		try_to_free_swap(target);
+	}
 }
 
 /*
diff --git a/mm/migrate.c b/mm/migrate.c
index 068655d..364edf7 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -580,7 +580,7 @@ static int move_to_new_page(struct page *newpage, struct page *page)
 	} else
 		newpage->mapping = NULL;
 
-	unlock_page(newpage);
+	/* keep lock on newpage because mem_cgroup_end_migration assumes it */
 
 	return rc;
 }
@@ -595,6 +595,7 @@ static int unmap_and_move(new_page_t get_new_page, unsigned long private,
 	int rc = 0;
 	int *result = NULL;
 	struct page *newpage = get_new_page(page, private, &result);
+	int newpage_locked = 0;
 	int rcu_locked = 0;
 	int charge = 0;
 	struct mem_cgroup *mem;
@@ -671,8 +672,10 @@ static int unmap_and_move(new_page_t get_new_page, unsigned long private,
 	/* Establish migration ptes or remove ptes */
 	try_to_unmap(page, 1);
 
-	if (!page_mapped(page))
+	if (!page_mapped(page)) {
 		rc = move_to_new_page(newpage, page);
+		newpage_locked = 1;
+	}
 
 	if (rc)
 		remove_migration_ptes(page, page);
@@ -683,6 +686,8 @@ uncharge:
 	if (!charge)
 		mem_cgroup_end_migration(mem, page, newpage);
 unlock:
+	if (newpage_locked)
+		unlock_page(newpage);
 	unlock_page(page);
 
 	if (rc != -EAGAIN) {
* Re: [PATCH 2/2] memcg: free unused swapcache at the end of page migration
  2009-04-21  5:29 ` [PATCH 2/2] memcg: free unused swapcache at the end of page migration Daisuke Nishimura
@ 2009-04-21  6:53   ` KAMEZAWA Hiroyuki
  0 siblings, 0 replies; 5+ messages in thread

From: KAMEZAWA Hiroyuki @ 2009-04-21 6:53 UTC (permalink / raw)
To: Daisuke Nishimura; +Cc: Andrew Morton, linux-mm, Balbir Singh

On Tue, 21 Apr 2009 14:29:31 +0900
Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> wrote:

> Reading the comments, mem_cgroup_end_migration assumes that "newpage" is
> under lock_page.
> 
> And at the end of mem_cgroup_end_migration, mem_cgroup_uncharge_page cannot
> uncharge the "target" if it's SwapCache, even if the owner process has
> already called zap_pte_range -> free_swap_and_cache.
> try_to_free_swap does all necessary checks (it checks page_swapcount).
> 
> Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>

Nishimura-san, I'd like to handle this issue by my own handle-stale-swapcache
patch. (I'll post it today.) So, could you hold this patch for a while?

Thanks,
-Kame

> ---
>  mm/memcontrol.c |    7 +++++--
>  mm/migrate.c    |    9 +++++++--
>  2 files changed, 12 insertions(+), 4 deletions(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 619b0c1..f41433c 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -1611,10 +1611,13 @@ void mem_cgroup_end_migration(struct mem_cgroup *mem,
>  	 * There is a case for !page_mapped(). At the start of
>  	 * migration, oldpage was mapped. But now, it's zapped.
>  	 * But we know *target* page is not freed/reused under us.
> -	 * mem_cgroup_uncharge_page() does all necessary checks.
> +	 * mem_cgroup_uncharge_page() cannot free SwapCache, so we call
> +	 * try_to_free_swap(), which does all necessary checks.
>  	 */
> -	if (ctype == MEM_CGROUP_CHARGE_TYPE_MAPPED)
> +	if (ctype == MEM_CGROUP_CHARGE_TYPE_MAPPED && !page_mapped(target)) {
>  		mem_cgroup_uncharge_page(target);
> +		try_to_free_swap(target);
> +	}
>  }
> 
>  /*
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 068655d..364edf7 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -580,7 +580,7 @@ static int move_to_new_page(struct page *newpage, struct page *page)
>  	} else
>  		newpage->mapping = NULL;
> 
> -	unlock_page(newpage);
> +	/* keep lock on newpage because mem_cgroup_end_migration assumes it */
> 
>  	return rc;
>  }
> @@ -595,6 +595,7 @@ static int unmap_and_move(new_page_t get_new_page, unsigned long private,
>  	int rc = 0;
>  	int *result = NULL;
>  	struct page *newpage = get_new_page(page, private, &result);
> +	int newpage_locked = 0;
>  	int rcu_locked = 0;
>  	int charge = 0;
>  	struct mem_cgroup *mem;
> @@ -671,8 +672,10 @@ static int unmap_and_move(new_page_t get_new_page, unsigned long private,
>  	/* Establish migration ptes or remove ptes */
>  	try_to_unmap(page, 1);
> 
> -	if (!page_mapped(page))
> +	if (!page_mapped(page)) {
>  		rc = move_to_new_page(newpage, page);
> +		newpage_locked = 1;
> +	}
> 
>  	if (rc)
>  		remove_migration_ptes(page, page);
> @@ -683,6 +686,8 @@ uncharge:
>  	if (!charge)
>  		mem_cgroup_end_migration(mem, page, newpage);
>  unlock:
> +	if (newpage_locked)
> +		unlock_page(newpage);
>  	unlock_page(page);
> 
>  	if (rc != -EAGAIN) {