linux-mm.kvack.org archive mirror
* [RFC][PATCH] memcg: fix a race when setting memcg.swappiness
@ 2009-01-14  3:24 Li Zefan
  2009-01-14  4:26 ` KAMEZAWA Hiroyuki
  0 siblings, 1 reply; 6+ messages in thread
From: Li Zefan @ 2009-01-14  3:24 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki, Balbir Singh, Paul Menage
  Cc: Daisuke Nishimura, Linux Containers, linux-mm

(suppose: memcg->use_hierarchy == 0 and memcg->swappiness == 60)

echo 10 > /memcg/0/swappiness   |
  mem_cgroup_swappiness_write() |
    ...                         | echo 1 > /memcg/0/use_hierarchy
                                | mkdir /memcg/0/1
                                |   sub_memcg->swappiness = 60;
    memcg->swappiness = 10;     |

In the above scenario, we end up having 2 different swappiness
values in a single hierarchy.
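
For illustration only (not part of the patch): a minimal userspace sketch in C
that replays the interleaving above in a single thread. The struct and function
names are hypothetical stand-ins, not kernel symbols, and the model assumes the
group sits directly under a non-hierarchical root, so only the
"use_hierarchy && has children" half of the check is represented.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct mem_cgroup; not kernel code. */
struct memcg_race_model {
	bool use_hierarchy;
	unsigned int swappiness;
	size_t nr_children;	/* stands in for !list_empty(&cgrp->children) */
};

/* First half of the write path: the check done before the store. */
static bool race_write_allowed(const struct memcg_race_model *m)
{
	return !(m->use_hierarchy && m->nr_children > 0);
}

/* Models mkdir: the child inherits the parent's current swappiness. */
static void race_create_child(struct memcg_race_model *parent,
			      struct memcg_race_model *child)
{
	child->use_hierarchy = parent->use_hierarchy;
	child->swappiness = parent->swappiness;
	child->nr_children = 0;
	parent->nr_children++;
}

/*
 * Replays the interleaving from the diagram and reports the final values.
 * Returns true if parent and child end up with different swappiness.
 */
static bool replay_swappiness_race(unsigned int *parent_val,
				   unsigned int *child_val)
{
	struct memcg_race_model parent = { .use_hierarchy = false, .swappiness = 60 };
	struct memcg_race_model child;
	bool allowed;

	/* Writer: the check passes, there are no children yet. */
	allowed = race_write_allowed(&parent);

	/* Other task: echo 1 > use_hierarchy, then mkdir. */
	parent.use_hierarchy = true;
	race_create_child(&parent, &child);

	/* Writer resumes and stores, relying on the stale check result. */
	if (allowed)
		parent.swappiness = 10;

	*parent_val = parent.swappiness;
	*child_val = child.swappiness;
	return parent.swappiness != child.swappiness;
}
```

In this model the check and the store are two separate steps with nothing
serializing them against the mkdir path, which is the window the patch closes.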

Note that we can't use hierarchy_mutex here, because it is not held
while the create() method runs.

Though IMO using cgroup_lock() in simple write functions is OK,
Paul would like to avoid it. He suggested using a counter to
count the number of children instead of checking the cgrp->children list:

=================
create() does:

lock memcg_parent
memcg->swappiness = memcg->parent->swappiness;
memcg_parent->child_count++;
unlock memcg_parent

and write() does:

lock memcg
if (!memcg->child_count) {
  memcg->swappiness = swappiness;
} else {
  report error;
}
unlock memcg

destroy() does:
lock memcg_parent
memcg_parent->child_count--;
unlock memcg_parent

=================

There is a subtle difference from checking cgrp->children:
a cgroup is removed from its parent's list in cgroup_rmdir(),
while memcg->child_count would only be decremented later, in cgroup_diput().
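
For illustration only: the counter scheme above could be modeled in userspace
C roughly as follows. This is a sketch under assumptions, not the proposed
kernel code: a pthread mutex stands in for the per-memcg lock, all names are
hypothetical, and returning -EINVAL for "report error" is an assumption taken
from the existing write handler.

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct mem_cgroup. */
struct memcg_cc {
	pthread_mutex_t lock;
	struct memcg_cc *parent;
	unsigned int swappiness;
	unsigned int child_count;
};

static void memcg_cc_init_root(struct memcg_cc *m, unsigned int swappiness)
{
	pthread_mutex_init(&m->lock, NULL);
	m->parent = NULL;
	m->swappiness = swappiness;
	m->child_count = 0;
}

/* create(): inherit and count the child under the parent's lock. */
static void memcg_cc_create(struct memcg_cc *child, struct memcg_cc *parent)
{
	pthread_mutex_init(&child->lock, NULL);
	child->parent = parent;
	child->child_count = 0;

	pthread_mutex_lock(&parent->lock);
	child->swappiness = parent->swappiness;
	parent->child_count++;
	pthread_mutex_unlock(&parent->lock);
}

/* write(): check and store under the same lock that create() takes. */
static int memcg_cc_write(struct memcg_cc *m, unsigned int val)
{
	int ret = 0;

	if (val > 100)
		return -EINVAL;

	pthread_mutex_lock(&m->lock);
	if (m->child_count == 0)
		m->swappiness = val;
	else
		ret = -EINVAL;	/* "report error" in the pseudocode */
	pthread_mutex_unlock(&m->lock);
	return ret;
}

/* destroy(): drop the parent's counter under the parent's lock. */
static void memcg_cc_destroy(struct memcg_cc *child)
{
	struct memcg_cc *parent = child->parent;

	pthread_mutex_lock(&parent->lock);
	parent->child_count--;
	pthread_mutex_unlock(&parent->lock);
	pthread_mutex_destroy(&child->lock);
}
```

Because the inherit-and-count step and the check-and-store step take the same
lock, a child can only observe the parent's value from before or after a
write, never race with it.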


Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
---
 mm/memcontrol.c |   10 +++++++++-
 1 files changed, 9 insertions(+), 1 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index e2996b8..0274223 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1971,6 +1971,7 @@ static int mem_cgroup_swappiness_write(struct cgroup *cgrp, struct cftype *cft,
 {
 	struct mem_cgroup *memcg = mem_cgroup_from_cont(cgrp);
 	struct mem_cgroup *parent;
+
 	if (val > 100)
 		return -EINVAL;
 
@@ -1978,15 +1979,22 @@ static int mem_cgroup_swappiness_write(struct cgroup *cgrp, struct cftype *cft,
 		return -EINVAL;
 
 	parent = mem_cgroup_from_cont(cgrp->parent);
+
+	cgroup_lock();
+
 	/* If under hierarchy, only empty-root can set this value */
 	if ((parent->use_hierarchy) ||
-	    (memcg->use_hierarchy && !list_empty(&cgrp->children)))
+	    (memcg->use_hierarchy && !list_empty(&cgrp->children))) {
+		cgroup_unlock();
 		return -EINVAL;
+	}
 
 	spin_lock(&memcg->reclaim_param_lock);
 	memcg->swappiness = val;
 	spin_unlock(&memcg->reclaim_param_lock);
 
+	cgroup_unlock();
+
 	return 0;
 }
 
-- 
1.5.4.rc3

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

* Re: [RFC][PATCH] memcg: fix a race when setting memcg.swappiness
  2009-01-14  3:24 [RFC][PATCH] memcg: fix a race when setting memcg.swappiness Li Zefan
@ 2009-01-14  4:26 ` KAMEZAWA Hiroyuki
  2009-01-14  6:47   ` Li Zefan
  0 siblings, 1 reply; 6+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-01-14  4:26 UTC (permalink / raw)
  To: Li Zefan
  Cc: Balbir Singh, Paul Menage, Daisuke Nishimura, Linux Containers, linux-mm

Seems reasonable, but, hmm...

Why can't hierarchy_mutex be used for create()?

-Kame


* Re: [RFC][PATCH] memcg: fix a race when setting memcg.swappiness
  2009-01-14  4:26 ` KAMEZAWA Hiroyuki
@ 2009-01-14  6:47   ` Li Zefan
  2009-01-14  7:05     ` KAMEZAWA Hiroyuki
  0 siblings, 1 reply; 6+ messages in thread
From: Li Zefan @ 2009-01-14  6:47 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Balbir Singh, Paul Menage, Daisuke Nishimura, Linux Containers, linux-mm

KAMEZAWA Hiroyuki wrote:
> Seems reasonable, but, hmm...
> 

Do you mean you agree to avoid using cgroup_lock()?

> Why can't hierarchy_mutex be used for create()?
> 

We can make hierarchy_mutex work for this race by:

@@ -2403,16 +2403,18 @@ static long cgroup_create(struct cgroup *parent, struct
        if (notify_on_release(parent))
                set_bit(CGRP_NOTIFY_ON_RELEASE, &cgrp->flags);

+       cgroup_lock_hierarchy(root);
+
        for_each_subsys(root, ss) {
                struct cgroup_subsys_state *css = ss->create(ss, cgrp);
                if (IS_ERR(css)) {
+                       cgroup_unlock_hierarchy(root);
                        err = PTR_ERR(css);
                        goto err_destroy;
                }
                init_cgroup_css(css, ss, cgrp);
        }

-       cgroup_lock_hierarchy(root);
        list_add(&cgrp->sibling, &cgrp->parent->children);
        cgroup_unlock_hierarchy(root);
        root->number_of_cgroups++;

But this may not be what we want, because hierarchy_mutex is meant to be
lightweight, so it's not held while subsys callbacks are invoked, except
bind().

* Re: [RFC][PATCH] memcg: fix a race when setting memcg.swappiness
  2009-01-14  6:47   ` Li Zefan
@ 2009-01-14  7:05     ` KAMEZAWA Hiroyuki
  2009-01-14  7:22       ` Li Zefan
  0 siblings, 1 reply; 6+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-01-14  7:05 UTC (permalink / raw)
  To: Li Zefan
  Cc: Balbir Singh, Paul Menage, Daisuke Nishimura, Linux Containers, linux-mm

On Wed, 14 Jan 2009 14:47:18 +0800
Li Zefan <lizf@cn.fujitsu.com> wrote:

> KAMEZAWA Hiroyuki wrote:
> > Seems reasonable, but, hmm...
> > 
> 
> Do you mean you agree to avoid using cgroup_lock()?
> 
> > Why can't hierarchy_mutex be used for create()?
> > 
> 
> We can make hierarchy_mutex work for this race by:
> 
> @@ -2403,16 +2403,18 @@ static long cgroup_create(struct cgroup *parent, struct
>         if (notify_on_release(parent))
>                 set_bit(CGRP_NOTIFY_ON_RELEASE, &cgrp->flags);
> 
> +       cgroup_lock_hierarchy(root);
> +
>         for_each_subsys(root, ss) {
>                 struct cgroup_subsys_state *css = ss->create(ss, cgrp);
>                 if (IS_ERR(css)) {
> +                       cgroup_unlock_hierarchy(root);
>                         err = PTR_ERR(css);
>                         goto err_destroy;
>                 }
>                 init_cgroup_css(css, ss, cgrp);
>         }
> 
> -       cgroup_lock_hierarchy(root);
>         list_add(&cgrp->sibling, &cgrp->parent->children);
>         cgroup_unlock_hierarchy(root);
>         root->number_of_cgroups++;
> 
> But this may not be what we want, because hierarchy_mutex is meant to be
> lightweight, so it's not held while subsys callbacks are invoked, except
> bind().
> 

Ah, I see your point. But "we can't trust hierarchy_mutex for create()"
is a problem. How about the following?
==
for_each_subsys(root, ss) {
	if (ss->create) {
		mutex_lock(&ss->hierarchy_mutex);
		css = ss->create(ss, cgrp);
		mutex_unlock(&ss->hierarchy_mutex);
		if (IS_ERR(css)) {
			/* error handling elided */
		}
	}
}
==

-Kame



* Re: [RFC][PATCH] memcg: fix a race when setting memcg.swappiness
  2009-01-14  7:05     ` KAMEZAWA Hiroyuki
@ 2009-01-14  7:22       ` Li Zefan
  2009-01-14  7:29         ` KAMEZAWA Hiroyuki
  0 siblings, 1 reply; 6+ messages in thread
From: Li Zefan @ 2009-01-14  7:22 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Balbir Singh, Paul Menage, Daisuke Nishimura, Linux Containers, linux-mm

KAMEZAWA Hiroyuki wrote:
> Ah, I see your point. But "we can't trust hierarchy_mutex for create()"
> is a problem. How about the following?

Yes, I think it can be a problem, so it should be used carefully.

> ==
> for_each_subsys(root, ss) {
> 	if (ss->create) {
> 		mutex_lock(&ss->hierarchy_mutex);
> 		css = ss->create(ss, cgroup);
> 		mutex_unlock(&ss->hierarchy_mutex);
> 		if (IS_ERR(...)) {
> 		}
> 	}

This won't work. :(

The lock should include both create() and list_add(&cgrp->sibling, &cgrp->parent->children);
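
For illustration only: a small single-threaded C model of why the lock has to
span both steps. All names here are hypothetical, the "lock" is just a flag,
and the model assumes the writer takes the same lock around its
check-and-store and that use_hierarchy is already enabled.

```c
#include <stdbool.h>

/* Hypothetical model state; none of these names are kernel symbols. */
struct create_window_model {
	bool hierarchy_locked;		/* stands in for hierarchy_mutex being held */
	bool child_linked;		/* child is on parent->children */
	unsigned int parent_swappiness;
	unsigned int child_swappiness;
};

/*
 * Models a writer that takes the same lock for its check-and-store.
 * Returns true if the store happened, false if the writer is blocked
 * by the lock or rejected because a child is already linked.
 */
static bool model_try_write(struct create_window_model *m, unsigned int val)
{
	if (m->hierarchy_locked || m->child_linked)
		return false;
	m->parent_swappiness = val;
	return true;
}

/*
 * Replays one mkdir with a writer arriving between create() and list_add().
 * Returns true if parent and child end up with different values.
 */
static bool model_mkdir(bool lock_covers_list_add)
{
	struct create_window_model m = { .parent_swappiness = 60 };
	bool wrote;

	m.hierarchy_locked = true;
	m.child_swappiness = m.parent_swappiness;	/* ss->create() */
	if (!lock_covers_list_add)
		m.hierarchy_locked = false;		/* lock dropped too early */

	/* The writer arrives in the window between create() and list_add(). */
	wrote = model_try_write(&m, 10);

	m.child_linked = true;				/* list_add() */
	m.hierarchy_locked = false;

	/* A writer that was blocked retries now and sees the linked child. */
	if (!wrote)
		model_try_write(&m, 10);

	return m.parent_swappiness != m.child_swappiness;
}
```

With the lock released right after create(), the child has already copied 60
but is not yet on the children list, so the writer's list check still passes.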


* Re: [RFC][PATCH] memcg: fix a race when setting memcg.swappiness
  2009-01-14  7:22       ` Li Zefan
@ 2009-01-14  7:29         ` KAMEZAWA Hiroyuki
  0 siblings, 0 replies; 6+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-01-14  7:29 UTC (permalink / raw)
  To: Li Zefan
  Cc: Balbir Singh, Paul Menage, Daisuke Nishimura, Linux Containers, linux-mm

On Wed, 14 Jan 2009 15:22:06 +0800
Li Zefan <lizf@cn.fujitsu.com> wrote:

> KAMEZAWA Hiroyuki wrote:
> > Ah, I see your point. But "we can't trust hierarchy_mutex for create()"
> > is a problem. How about the following?
> 
> Yes, I think it can be a problem, so it should be used carefully.
> 
> > ==
> > for_each_subsys(root, ss) {
> > 	if (ss->create) {
> > 		mutex_lock(&ss->hierarchy_mutex);
> > 		css = ss->create(ss, cgroup);
> > 		mutex_unlock(&ss->hierarchy_mutex);
> > 		if (IS_ERR(...)) {
> > 		}
> > 	}
> 
> This won't work. :(
> 
> The lock should include both create() and list_add(&cgrp->sibling, &cgrp->parent->children);
> 
> 
I see. Hmm... it seems that we have to use cgroup_lock() for now. Please go ahead.
The memory.use_hierarchy file also uses cgroup_lock().
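
For illustration only: a userspace C sketch of the agreed approach, where one
global lock serializes the swappiness write, the use_hierarchy write and
mkdir. The mutex stands in for cgroup_lock()/cgroup_unlock(); every name is
hypothetical, the -EBUSY return is only loosely modeled on the use_hierarchy
handler, and the parent->use_hierarchy half of the check is left out.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the global cgroup lock. */
static pthread_mutex_t model_cgroup_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical model of one memcg with at most one child. */
struct memcg_locked {
	bool use_hierarchy;
	bool has_child;
	unsigned int swappiness;
	unsigned int child_swappiness;
};

/* Check and store happen under the same lock as mkdir. */
static int locked_swappiness_write(struct memcg_locked *m, unsigned int val)
{
	int ret = 0;

	if (val > 100)
		return -EINVAL;

	pthread_mutex_lock(&model_cgroup_lock);
	if (m->use_hierarchy && m->has_child)
		ret = -EINVAL;
	else
		m->swappiness = val;
	pthread_mutex_unlock(&model_cgroup_lock);
	return ret;
}

/* Models the use_hierarchy write; refused once a child exists. */
static int locked_set_use_hierarchy(struct memcg_locked *m, bool val)
{
	int ret = 0;

	pthread_mutex_lock(&model_cgroup_lock);
	if (m->has_child)
		ret = -EBUSY;
	else
		m->use_hierarchy = val;
	pthread_mutex_unlock(&model_cgroup_lock);
	return ret;
}

/* Models mkdir: inherit and link the child in one critical section. */
static void locked_mkdir(struct memcg_locked *m)
{
	pthread_mutex_lock(&model_cgroup_lock);
	m->child_swappiness = m->swappiness;
	m->has_child = true;
	pthread_mutex_unlock(&model_cgroup_lock);
}
```

Whichever order the three operations are serialized in, the child either
inherits the newly written value or the write is rejected, so parent and
child agree in this model.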

Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>

Thank you!
-Kame
