From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
To: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: linux-mm@kvack.org, mgorman@suse.de, dhillf@gmail.com,
	aarcange@redhat.com, mhocko@suse.cz, akpm@linux-foundation.org,
	hannes@cmpxchg.org, linux-kernel@vger.kernel.org,
	cgroups@vger.kernel.org
Subject: Re: [PATCH -V5 12/14] memcg: move HugeTLB resource count to parent cgroup on memcg removal
Date: Mon, 09 Apr 2012 15:30:42 +0530
Message-ID: <87ty0tcjhx.fsf@skywalker.in.ibm.com>
In-Reply-To: <4F827EAD.9080300@jp.fujitsu.com>

KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> writes:

> (2012/04/07 3:50), Aneesh Kumar K.V wrote:
>
>> From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
>> 
>> This adds support for memcg removal with HugeTLB resource usage.
>> 
>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
>
>
> Hmm 
>
>

....
...

>> +	csize = PAGE_SIZE << compound_order(page);
>> +	/*
>> +	 * uncharge from child and charge the parent. If we have
>> +	 * use_hierarchy set, we can never fail here. In-order to make
>> +	 * sure we don't get -ENOMEM on parent charge, we first uncharge
>> +	 * the child and then charge the parent.
>> +	 */
>> +	if (parent->use_hierarchy) {
>
>
>> +		res_counter_uncharge(&memcg->hugepage[idx], csize);
>> +		if (!mem_cgroup_is_root(parent))
>> +			ret = res_counter_charge(&parent->hugepage[idx],
>> +						 csize, &fail_res);
>
>
> Ah, why is !mem_cgroup_is_root() checked ? no res_counter update for
> root cgroup ?

My mistake. Earlier versions of the patch series didn't charge/uncharge
the root cgroup during these operations. Later, as per your review, I
updated the charge/uncharge path to charge the root cgroup, but I missed
updating this code.

>
> I think it's better to have res_counter_move_parent()...to do ops in atomic.
> (I'll post a patch for that for my purpose). OR, just ignore res->usage if
> parent->use_hierarchy == 1.
>
> uncharge->charge will have a race.



How about the below?

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 7b6e79a..5b4bc98 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3351,24 +3351,24 @@ int mem_cgroup_move_hugetlb_parent(int idx, struct cgroup *cgroup,
 
 	csize = PAGE_SIZE << compound_order(page);
 	/*
-	 * uncharge from child and charge the parent. If we have
-	 * use_hierarchy set, we can never fail here. In-order to make
-	 * sure we don't get -ENOMEM on parent charge, we first uncharge
-	 * the child and then charge the parent.
+	 * If we have use_hierarchy set we can never fail here. So instead of
+	 * using res_counter_uncharge use the open-coded variant which just
+	 * uncharges the child res_counter. The parent will retain the charge.
 	 */
 	if (parent->use_hierarchy) {
-		res_counter_uncharge(&memcg->hugepage[idx], csize);
-		if (!mem_cgroup_is_root(parent))
-			ret = res_counter_charge(&parent->hugepage[idx],
-						 csize, &fail_res);
+		unsigned long flags;
+		struct res_counter *counter;
+
+		counter = &memcg->hugepage[idx];
+		spin_lock_irqsave(&counter->lock, flags);
+		res_counter_uncharge_locked(counter, csize);
+		spin_unlock_irqrestore(&counter->lock, flags);
 	} else {
-		if (!mem_cgroup_is_root(parent)) {
-			ret = res_counter_charge(&parent->hugepage[idx],
-						 csize, &fail_res);
-			if (ret) {
-				ret = -EBUSY;
-				goto err_out;
-			}
+		ret = res_counter_charge(&parent->hugepage[idx],
+					 csize, &fail_res);
+		if (ret) {
+			ret = -EBUSY;
+			goto err_out;
 		}
 		res_counter_uncharge(&memcg->hugepage[idx], csize);
 	}
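
To illustrate the race and why uncharging only the child avoids it, here
is a minimal userspace sketch. It is a toy model, not the kernel
res_counter code: struct counter, counter_charge(), counter_uncharge(),
counter_uncharge_local() and the two move_*() helpers are made-up names,
and the model assumes limits are checked at every level of the hierarchy.
The "intruder" argument stands in for another task that charges while the
move is in progress.

```c
#include <assert.h>
#include <stddef.h>

/* Toy hierarchical counter; hypothetical, not struct res_counter. */
struct counter {
	unsigned long usage;
	unsigned long limit;
	struct counter *parent;
};

/* Charge this counter and all ancestors; roll back and fail on a limit. */
static int counter_charge(struct counter *c, unsigned long val)
{
	struct counter *i, *u;

	for (i = c; i; i = i->parent) {
		if (i->usage + val > i->limit)
			goto undo;
		i->usage += val;
	}
	return 0;
undo:
	/* Undo only the levels that were already charged. */
	for (u = c; u != i; u = u->parent)
		u->usage -= val;
	return -1;
}

/* Uncharge only this counter; ancestors keep the charge. */
static void counter_uncharge_local(struct counter *c, unsigned long val)
{
	assert(c->usage >= val);
	c->usage -= val;
}

/* Uncharge this counter and all ancestors. */
static void counter_uncharge(struct counter *c, unsigned long val)
{
	for (; c; c = c->parent)
		counter_uncharge_local(c, val);
}

/*
 * Old sequence: hierarchical uncharge, then re-charge the parent.
 * Between the two steps the parent usage dips, so an intruder can
 * take the room and the re-charge can fail.
 */
static int move_uncharge_then_charge(struct counter *child,
				     struct counter *intruder,
				     unsigned long csize)
{
	counter_uncharge(child, csize);
	if (intruder)
		counter_charge(intruder, csize);	/* result ignored */
	return counter_charge(child->parent, csize);
}

/*
 * New sequence: drop only the child's own usage. The parent usage
 * never changes, so there is no window and nothing that can fail.
 */
static int move_local_uncharge(struct counter *child,
			       struct counter *intruder,
			       unsigned long csize)
{
	counter_uncharge_local(child, csize);
	if (intruder)
		counter_charge(intruder, csize);	/* result ignored */
	return 0;
}
```

With a parent limited to two huge pages and one page charged in each of
two children, the first helper can return an error when the sibling
charges inside the window, while the second leaves the parent usage
untouched and the sibling's extra charge is refused instead.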


>
>> +	} else {
>> +		if (!mem_cgroup_is_root(parent)) {
>> +			ret = res_counter_charge(&parent->hugepage[idx],
>> +						 csize, &fail_res);
>> +			if (ret) {
>> +				ret = -EBUSY;
>> +				goto err_out;
>> +			}
>> +		}
>> +		res_counter_uncharge(&memcg->hugepage[idx], csize);
>> +	}
>
>
> Just a notice. Recently, Tejun changed failure of pre_destroy() to show WARNING.
> Then, I'd like to move the usage to the root cgroup if use_hierarchy=0.
> Will it work for you ?

That should work.


>
>> +	/*
>> +	 * caller should have done css_get
>> +	 */
>
>
> Could you explain meaning of this comment ?
>

It is inherited from mem_cgroup_move_account(). I guess it means the css
cannot go away at this point, since we have done a css_get on the child.
For a generic move_account function the comment may be needed. I guess
in our case the comment is redundant?

-aneesh
