From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
To: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: linux-mm@kvack.org, mgorman@suse.de, dhillf@gmail.com,
	aarcange@redhat.com, mhocko@suse.cz, akpm@linux-foundation.org,
	hannes@cmpxchg.org, linux-kernel@vger.kernel.org,
	cgroups@vger.kernel.org
Subject: Re: [PATCH -V5 12/14] memcg: move HugeTLB resource count to parent cgroup on memcg removal
Date: Mon, 09 Apr 2012 15:16:13 +0900
Message-ID: <4F827EAD.9080300@jp.fujitsu.com>
In-Reply-To: <1333738260-1329-13-git-send-email-aneesh.kumar@linux.vnet.ibm.com>
(2012/04/07 3:50), Aneesh Kumar K.V wrote:
> From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
>
> This add support for memcg removal with HugeTLB resource usage.
>
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Hmm
> +#ifdef CONFIG_MEM_RES_CTLR_HUGETLB
> +/*
> + * Force the memcg to empty the hugetlb resources by moving them to
> + * the parent cgroup. We can fail if the parent cgroup's limit prevented
> + * the charging. This should only happen if use_hierarchy is not set.
> + */
> +int hugetlb_force_memcg_empty(struct cgroup *cgroup)
> +{
> +	struct hstate *h;
> +	struct page *page;
> +	int ret = 0, idx = 0;
> +
> +	do {
> +		if (cgroup_task_count(cgroup) || !list_empty(&cgroup->children))
> +			goto out;
> +		/*
> +		 * If the task doing the cgroup_rmdir got a signal
> +		 * we don't really need to loop till the hugetlb resource
> +		 * usage become zero.
> +		 */
> +		if (signal_pending(current)) {
> +			ret = -EINTR;
> +			goto out;
> +		}
> +		for_each_hstate(h) {
> +			spin_lock(&hugetlb_lock);
> +			list_for_each_entry(page, &h->hugepage_activelist, lru) {
> +				ret = mem_cgroup_move_hugetlb_parent(idx, cgroup, page);
> +				if (ret) {
> +					spin_unlock(&hugetlb_lock);
> +					goto out;
> +				}
> +			}
> +			spin_unlock(&hugetlb_lock);
> +			idx++;
> +		}
> +		cond_resched();
> +	} while (mem_cgroup_have_hugetlb_usage(cgroup));
> +out:
> +	return ret;
> +}
> +#endif
> +
> /* Should be called on processing a hugepagesz=... option */
> void __init hugetlb_add_hstate(unsigned order)
> {
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 7d3330e..7b6e79a 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -3228,9 +3228,11 @@ static inline int mem_cgroup_move_swap_account(swp_entry_t entry,
> #endif
>
>  #ifdef CONFIG_MEM_RES_CTLR_HUGETLB
> -static bool mem_cgroup_have_hugetlb_usage(struct mem_cgroup *memcg)
> +bool mem_cgroup_have_hugetlb_usage(struct cgroup *cgroup)
>  {
>  	int idx;
> +	struct mem_cgroup *memcg = mem_cgroup_from_cont(cgroup);
> +
>  	for (idx = 0; idx < hugetlb_max_hstate; idx++) {
>  		if ((res_counter_read_u64(&memcg->hugepage[idx], RES_USAGE)) > 0)
>  			return 1;
> @@ -3328,10 +3330,57 @@ void mem_cgroup_hugetlb_uncharge_memcg(int idx, unsigned long nr_pages,
>  	res_counter_uncharge(&memcg->hugepage[idx], csize);
>  	return;
>  }
> -#else
> -static bool mem_cgroup_have_hugetlb_usage(struct mem_cgroup *memcg)
> +
> +int mem_cgroup_move_hugetlb_parent(int idx, struct cgroup *cgroup,
> +				   struct page *page)
>  {
> -	return 0;
> +	struct page_cgroup *pc;
> +	int csize, ret = 0;
> +	struct res_counter *fail_res;
> +	struct cgroup *pcgrp = cgroup->parent;
> +	struct mem_cgroup *parent = mem_cgroup_from_cont(pcgrp);
> +	struct mem_cgroup *memcg = mem_cgroup_from_cont(cgroup);
> +
> +	if (!get_page_unless_zero(page))
> +		goto out;
> +
> +	pc = lookup_page_cgroup(page);
> +	lock_page_cgroup(pc);
> +	if (!PageCgroupUsed(pc) || pc->mem_cgroup != memcg)
> +		goto err_out;
> +
> +	csize = PAGE_SIZE << compound_order(page);
> +	/*
> +	 * uncharge from child and charge the parent. If we have
> +	 * use_hierarchy set, we can never fail here. In-order to make
> +	 * sure we don't get -ENOMEM on parent charge, we first uncharge
> +	 * the child and then charge the parent.
> +	 */
> +	if (parent->use_hierarchy) {
> +		res_counter_uncharge(&memcg->hugepage[idx], csize);
> +		if (!mem_cgroup_is_root(parent))
> +			ret = res_counter_charge(&parent->hugepage[idx],
> +						 csize, &fail_res);
Ah, why is !mem_cgroup_is_root() checked here? Is there no res_counter
update for the root cgroup?
I think it would be better to have res_counter_move_parent() so that the
two operations are done atomically (I'll post a patch for that for my own
purposes). Or just ignore res->usage if parent->use_hierarchy == 1.
Doing uncharge and then charge as separate steps will have a race.
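To make the concern concrete, here is a minimal userspace sketch of what an
atomic "move to parent" could look like. It is only an illustration under
stated assumptions: struct toy_counter and the toy_counter_*() functions are
hypothetical names invented for this sketch, the counters are flat (a charge
is not propagated up a hierarchy as res_counter does), and a pthread mutex
stands in for the kernel's locking. It is not the res_counter_move_parent()
patch mentioned above.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for a resource counter (not res_counter). */
struct toy_counter {
	pthread_mutex_t lock;
	uint64_t usage;		/* bytes currently charged */
	uint64_t limit;		/* fixed at init in this sketch */
};

void toy_counter_init(struct toy_counter *c, uint64_t limit)
{
	pthread_mutex_init(&c->lock, NULL);
	c->usage = 0;
	c->limit = limit;
}

/* Charge val bytes; returns 0 on success, -1 if the limit would be exceeded. */
int toy_counter_charge(struct toy_counter *c, uint64_t val)
{
	int ret = -1;

	pthread_mutex_lock(&c->lock);
	assert(c->usage <= c->limit);
	if (val <= c->limit - c->usage) {
		c->usage += val;
		ret = 0;
	}
	pthread_mutex_unlock(&c->lock);
	return ret;
}

/*
 * Move val bytes of usage from child to parent in one step.
 * Both locks are held across the update, so no other thread can charge
 * the parent between the "uncharge" and the "charge". Lock order is
 * always child before parent; this sketch assumes moves only go upward.
 * Returns 0 on success, -1 (with both counters unchanged) on failure.
 */
int toy_counter_move_parent(struct toy_counter *child,
			    struct toy_counter *parent, uint64_t val)
{
	int ret = -1;

	pthread_mutex_lock(&child->lock);
	pthread_mutex_lock(&parent->lock);
	if (val <= child->usage && val <= parent->limit - parent->usage) {
		child->usage -= val;
		parent->usage += val;
		ret = 0;
	}
	pthread_mutex_unlock(&parent->lock);
	pthread_mutex_unlock(&child->lock);
	return ret;
}
```

With separate uncharge and charge calls, another task could consume the
parent's remaining room between the two steps; taking both locks for the
whole move closes that window, and a failed move leaves the child's usage
as it was.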
> +	} else {
> +		if (!mem_cgroup_is_root(parent)) {
> +			ret = res_counter_charge(&parent->hugepage[idx],
> +						 csize, &fail_res);
> +			if (ret) {
> +				ret = -EBUSY;
> +				goto err_out;
> +			}
> +		}
> +		res_counter_uncharge(&memcg->hugepage[idx], csize);
> +	}
Just a note: Tejun recently changed a pre_destroy() failure to show a WARNING.
So I'd like to move the usage to the root cgroup if use_hierarchy=0.
Would that work for you?
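As a rough sketch of that destination choice (illustrative only: struct
toy_memcg and toy_move_target() are hypothetical names, and the structure is
a simplified stand-in rather than the kernel's struct mem_cgroup), the idea
is to hand charges to the parent when it is hierarchical and otherwise walk
up to the root, which the quoted code does not charge:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a memcg node. */
struct toy_memcg {
	struct toy_memcg *parent;	/* NULL for the root group */
	bool use_hierarchy;
};

/*
 * Pick the group that should inherit charges when memcg is removed.
 * If the parent has use_hierarchy set, the parent is the target.
 * Otherwise walk up to the root group and use that.
 * Returns NULL when memcg is itself the root.
 */
struct toy_memcg *toy_move_target(struct toy_memcg *memcg)
{
	struct toy_memcg *target;

	assert(memcg != NULL);
	target = memcg->parent;
	if (target == NULL)
		return NULL;
	if (target->use_hierarchy)
		return target;
	while (target->parent != NULL)
		target = target->parent;
	return target;
}
```

The point of falling back to the root is that a move which skips the limit
check cannot fail, so pre_destroy() would have no error to report in the
use_hierarchy=0 case.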
> +	/*
> +	 * caller should have done css_get
> +	 */
Could you explain the meaning of this comment?
Thanks,
-Kame
> +	pc->mem_cgroup = parent;
> +err_out:
> +	unlock_page_cgroup(pc);
> +	put_page(page);
> +out:
> +	return ret;
>  }
> #endif /* CONFIG_MEM_RES_CTLR_HUGETLB */
>
> @@ -3852,6 +3901,11 @@ static int mem_cgroup_force_empty(struct mem_cgroup *memcg, bool free_all)
>  	/* should free all ? */
>  	if (free_all)
>  		goto try_to_free;
> +
> +	/* move the hugetlb charges */
> +	ret = hugetlb_force_memcg_empty(cgrp);
> +	if (ret)
> +		goto out;
>  move_account:
>  	do {
>  		ret = -EBUSY;
> @@ -5172,12 +5226,6 @@ free_out:
>  static int mem_cgroup_pre_destroy(struct cgroup *cont)
>  {
>  	struct mem_cgroup *memcg = mem_cgroup_from_cont(cont);
> -	/*
> -	 * Don't allow memcg removal if we have HugeTLB resource
> -	 * usage.
> -	 */
> -	if (mem_cgroup_have_hugetlb_usage(memcg))
> -		return -EBUSY;
>
>  	return mem_cgroup_force_empty(memcg, false);
>  }