* [PATCH] memcg: don't register hotcpu notifier from ->css_alloc()
       [not found] <20121214012436.GA25481@localhost>
@ 2012-12-18 15:40 ` Tejun Heo
  2012-12-18 16:40   ` Michal Hocko
  0 siblings, 1 reply; 3+ messages in thread
From: Tejun Heo @ 2012-12-18 15:40 UTC (permalink / raw)
  To: Michal Hocko, Johannes Weiner, KAMEZAWA Hiroyuki, Balbir Singh
  Cc: LKML, Fengguang Wu, cgroups, linux-mm, Andrew Morton

648bb56d07 ("cgroup: lock cgroup_mutex in cgroup_init_subsys()") made
cgroup_init_subsys() grab cgroup_mutex before invoking ->css_alloc()
for the root css.  Because memcg registers its hotcpu notifier from
->css_alloc() for the root css, this introduced a circular locking
dependency between cgroup_mutex and cpu hotplug.

Fix it by moving the hotcpu notifier registration to a subsys initcall.

  ======================================================
  [ INFO: possible circular locking dependency detected ]
  3.7.0-rc4-work+ #42 Not tainted
  -------------------------------------------------------
  bash/645 is trying to acquire lock:
   (cgroup_mutex){+.+.+.}, at: [<ffffffff8110c5b7>] cgroup_lock+0x17/0x20

  but task is already holding lock:
   (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff8109300f>] cpu_hotplug_begin+0x2f/0x60

  which lock already depends on the new lock.


  the existing dependency chain (in reverse order) is:

 -> #1 (cpu_hotplug.lock){+.+.+.}:
         [<ffffffff810f8357>] lock_acquire+0x97/0x1e0
         [<ffffffff81be4701>] mutex_lock_nested+0x61/0x3b0
         [<ffffffff810930fc>] get_online_cpus+0x3c/0x60
         [<ffffffff811152fb>] rebuild_sched_domains_locked+0x1b/0x70
         [<ffffffff81116718>] cpuset_write_resmask+0x298/0x2c0
         [<ffffffff8110f70f>] cgroup_file_write+0x1ef/0x300
         [<ffffffff811c3b78>] vfs_write+0xa8/0x160
         [<ffffffff811c3e82>] sys_write+0x52/0xa0
         [<ffffffff81be89c2>] system_call_fastpath+0x16/0x1b

 -> #0 (cgroup_mutex){+.+.+.}:
         [<ffffffff810f74de>] __lock_acquire+0x14ce/0x1d20
         [<ffffffff810f8357>] lock_acquire+0x97/0x1e0
         [<ffffffff81be4701>] mutex_lock_nested+0x61/0x3b0
         [<ffffffff8110c5b7>] cgroup_lock+0x17/0x20
         [<ffffffff81116deb>] cpuset_handle_hotplug+0x1b/0x560
         [<ffffffff8111744e>] cpuset_update_active_cpus+0xe/0x10
         [<ffffffff810d0587>] cpuset_cpu_inactive+0x47/0x50
         [<ffffffff810c1476>] notifier_call_chain+0x66/0x150
         [<ffffffff810c156e>] __raw_notifier_call_chain+0xe/0x10
         [<ffffffff81092fa0>] __cpu_notify+0x20/0x40
         [<ffffffff81b9827e>] _cpu_down+0x7e/0x2f0
         [<ffffffff81b98526>] cpu_down+0x36/0x50
         [<ffffffff81b9c12d>] store_online+0x5d/0xe0
         [<ffffffff816b6ef8>] dev_attr_store+0x18/0x30
         [<ffffffff8123bb50>] sysfs_write_file+0xe0/0x150
         [<ffffffff811c3b78>] vfs_write+0xa8/0x160
         [<ffffffff811c3e82>] sys_write+0x52/0xa0
         [<ffffffff81be89c2>] system_call_fastpath+0x16/0x1b
  other info that might help us debug this:

   Possible unsafe locking scenario:

         CPU0                    CPU1
         ----                    ----
    lock(cpu_hotplug.lock);
                                 lock(cgroup_mutex);
                                 lock(cpu_hotplug.lock);
    lock(cgroup_mutex);

   *** DEADLOCK ***

  5 locks held by bash/645:
   #0:  (&buffer->mutex){+.+.+.}, at: [<ffffffff8123bab8>] sysfs_write_file+0x48/0x150
   #1:  (s_active#42){.+.+.+}, at: [<ffffffff8123bb38>] sysfs_write_file+0xc8/0x150
   #2:  (x86_cpu_hotplug_driver_mutex){+.+...}, at: [<ffffffff81079277>] cpu_hotplug_driver_lock+0x17/0x20
   #3:  (cpu_add_remove_lock){+.+.+.}, at: [<ffffffff81093157>] cpu_maps_update_begin+0x17/0x20
   #4:  (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff8109300f>] cpu_hotplug_begin+0x2f/0x60

  stack backtrace:
  Pid: 645, comm: bash Not tainted 3.7.0-rc4-work+ #42
  Call Trace:
   [<ffffffff81bdadfd>] print_circular_bug+0x28e/0x29f
   [<ffffffff810f74de>] __lock_acquire+0x14ce/0x1d20
   [<ffffffff810f8357>] lock_acquire+0x97/0x1e0
   [<ffffffff81be4701>] mutex_lock_nested+0x61/0x3b0
   [<ffffffff8110c5b7>] cgroup_lock+0x17/0x20
   [<ffffffff81116deb>] cpuset_handle_hotplug+0x1b/0x560
   [<ffffffff8111744e>] cpuset_update_active_cpus+0xe/0x10
   [<ffffffff810d0587>] cpuset_cpu_inactive+0x47/0x50
   [<ffffffff810c1476>] notifier_call_chain+0x66/0x150
   [<ffffffff810c156e>] __raw_notifier_call_chain+0xe/0x10
   [<ffffffff81092fa0>] __cpu_notify+0x20/0x40
   [<ffffffff81b9827e>] _cpu_down+0x7e/0x2f0
   [<ffffffff81b98526>] cpu_down+0x36/0x50
   [<ffffffff81b9c12d>] store_online+0x5d/0xe0
   [<ffffffff816b6ef8>] dev_attr_store+0x18/0x30
   [<ffffffff8123bb50>] sysfs_write_file+0xe0/0x150
   [<ffffffff811c3b78>] vfs_write+0xa8/0x160
   [<ffffffff811c3e82>] sys_write+0x52/0xa0
   [<ffffffff81be89c2>] system_call_fastpath+0x16/0x1b

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Cc: Michal Hocko <mhocko@suse.cz>
---
Michal, if it looks okay, can you please route this patch?

Thanks.

 mm/memcontrol.c |   14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 12307b3..7d8a27f 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -4982,7 +4982,6 @@ mem_cgroup_css_alloc(struct cgroup *cont)
 						&per_cpu(memcg_stock, cpu);
 			INIT_WORK(&stock->work, drain_local_stock);
 		}
-		hotcpu_notifier(memcg_cpu_hotplug_callback, 0);
 	} else {
 		parent = mem_cgroup_from_cont(cont->parent);
 		memcg->use_hierarchy = parent->use_hierarchy;
@@ -5644,6 +5643,19 @@ struct cgroup_subsys mem_cgroup_subsys = {
 	.use_id = 1,
 };
 
+/*
+ * The rest of init is performed during ->css_alloc() for root css which
+ * happens before initcalls.  hotcpu_notifier() can't be done together as
+ * it would introduce circular locking by adding cgroup_lock -> cpu hotplug
+ * dependency.  Do it from a subsys_initcall().
+ */
+static int __init mem_cgroup_init(void)
+{
+	hotcpu_notifier(memcg_cpu_hotplug_callback, 0);
+	return 0;
+}
+subsys_initcall(mem_cgroup_init);
+
 #ifdef CONFIG_MEMCG_SWAP
 static int __init enable_swap_account(char *s)
 {


* Re: [PATCH] memcg: don't register hotcpu notifier from ->css_alloc()
  2012-12-18 15:40 ` [PATCH] memcg: don't register hotcpu notifier from ->css_alloc() Tejun Heo
@ 2012-12-18 16:40   ` Michal Hocko
  2012-12-18 16:47     ` Tejun Heo
  0 siblings, 1 reply; 3+ messages in thread
From: Michal Hocko @ 2012-12-18 16:40 UTC (permalink / raw)
  To: Tejun Heo, Andrew Morton
  Cc: Johannes Weiner, KAMEZAWA Hiroyuki, Balbir Singh, LKML,
	Fengguang Wu, cgroups, linux-mm

On Tue 18-12-12 07:40:30, Tejun Heo wrote:
> 648bb56d07 ("cgroup: lock cgroup_mutex in cgroup_init_subsys()") made
> cgroup_init_subsys() grab cgroup_mutex before invoking ->css_alloc()
> for the root css.  Because memcg registers its hotcpu notifier from
> ->css_alloc() for the root css, this introduced a circular locking
> dependency between cgroup_mutex and cpu hotplug.
> 
> Fix it by moving the hotcpu notifier registration to a subsys initcall.
> 
>   ======================================================
>   [ INFO: possible circular locking dependency detected ]
>   3.7.0-rc4-work+ #42 Not tainted
>   -------------------------------------------------------
>   bash/645 is trying to acquire lock:
>    (cgroup_mutex){+.+.+.}, at: [<ffffffff8110c5b7>] cgroup_lock+0x17/0x20
> 
>   but task is already holding lock:
>    (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff8109300f>] cpu_hotplug_begin+0x2f/0x60
> 
>   which lock already depends on the new lock.
> 
> 
>   the existing dependency chain (in reverse order) is:
> 
>  -> #1 (cpu_hotplug.lock){+.+.+.}:
>          [<ffffffff810f8357>] lock_acquire+0x97/0x1e0
>          [<ffffffff81be4701>] mutex_lock_nested+0x61/0x3b0
>          [<ffffffff810930fc>] get_online_cpus+0x3c/0x60
>          [<ffffffff811152fb>] rebuild_sched_domains_locked+0x1b/0x70
>          [<ffffffff81116718>] cpuset_write_resmask+0x298/0x2c0
>          [<ffffffff8110f70f>] cgroup_file_write+0x1ef/0x300
>          [<ffffffff811c3b78>] vfs_write+0xa8/0x160
>          [<ffffffff811c3e82>] sys_write+0x52/0xa0
>          [<ffffffff81be89c2>] system_call_fastpath+0x16/0x1b
> 
>  -> #0 (cgroup_mutex){+.+.+.}:
>          [<ffffffff810f74de>] __lock_acquire+0x14ce/0x1d20
>          [<ffffffff810f8357>] lock_acquire+0x97/0x1e0
>          [<ffffffff81be4701>] mutex_lock_nested+0x61/0x3b0
>          [<ffffffff8110c5b7>] cgroup_lock+0x17/0x20
>          [<ffffffff81116deb>] cpuset_handle_hotplug+0x1b/0x560
>          [<ffffffff8111744e>] cpuset_update_active_cpus+0xe/0x10
>          [<ffffffff810d0587>] cpuset_cpu_inactive+0x47/0x50
>          [<ffffffff810c1476>] notifier_call_chain+0x66/0x150
>          [<ffffffff810c156e>] __raw_notifier_call_chain+0xe/0x10
>          [<ffffffff81092fa0>] __cpu_notify+0x20/0x40
>          [<ffffffff81b9827e>] _cpu_down+0x7e/0x2f0
>          [<ffffffff81b98526>] cpu_down+0x36/0x50
>          [<ffffffff81b9c12d>] store_online+0x5d/0xe0
>          [<ffffffff816b6ef8>] dev_attr_store+0x18/0x30
>          [<ffffffff8123bb50>] sysfs_write_file+0xe0/0x150
>          [<ffffffff811c3b78>] vfs_write+0xa8/0x160
>          [<ffffffff811c3e82>] sys_write+0x52/0xa0
>          [<ffffffff81be89c2>] system_call_fastpath+0x16/0x1b
>   other info that might help us debug this:
> 
>    Possible unsafe locking scenario:
> 
>          CPU0                    CPU1
>          ----                    ----
>     lock(cpu_hotplug.lock);
>                                  lock(cgroup_mutex);
>                                  lock(cpu_hotplug.lock);
>     lock(cgroup_mutex);
> 
>    *** DEADLOCK ***
> 
>   5 locks held by bash/645:
>    #0:  (&buffer->mutex){+.+.+.}, at: [<ffffffff8123bab8>] sysfs_write_file+0x48/0x150
>    #1:  (s_active#42){.+.+.+}, at: [<ffffffff8123bb38>] sysfs_write_file+0xc8/0x150
>    #2:  (x86_cpu_hotplug_driver_mutex){+.+...}, at: [<ffffffff81079277>] cpu_hotplug_driver_lock+0x17/0x20
>    #3:  (cpu_add_remove_lock){+.+.+.}, at: [<ffffffff81093157>] cpu_maps_update_begin+0x17/0x20
>    #4:  (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff8109300f>] cpu_hotplug_begin+0x2f/0x60
> 
>   stack backtrace:
>   Pid: 645, comm: bash Not tainted 3.7.0-rc4-work+ #42
>   Call Trace:
>    [<ffffffff81bdadfd>] print_circular_bug+0x28e/0x29f
>    [<ffffffff810f74de>] __lock_acquire+0x14ce/0x1d20
>    [<ffffffff810f8357>] lock_acquire+0x97/0x1e0
>    [<ffffffff81be4701>] mutex_lock_nested+0x61/0x3b0
>    [<ffffffff8110c5b7>] cgroup_lock+0x17/0x20
>    [<ffffffff81116deb>] cpuset_handle_hotplug+0x1b/0x560
>    [<ffffffff8111744e>] cpuset_update_active_cpus+0xe/0x10
>    [<ffffffff810d0587>] cpuset_cpu_inactive+0x47/0x50
>    [<ffffffff810c1476>] notifier_call_chain+0x66/0x150
>    [<ffffffff810c156e>] __raw_notifier_call_chain+0xe/0x10
>    [<ffffffff81092fa0>] __cpu_notify+0x20/0x40
>    [<ffffffff81b9827e>] _cpu_down+0x7e/0x2f0
>    [<ffffffff81b98526>] cpu_down+0x36/0x50
>    [<ffffffff81b9c12d>] store_online+0x5d/0xe0
>    [<ffffffff816b6ef8>] dev_attr_store+0x18/0x30
>    [<ffffffff8123bb50>] sysfs_write_file+0xe0/0x150
>    [<ffffffff811c3b78>] vfs_write+0xa8/0x160
>    [<ffffffff811c3e82>] sys_write+0x52/0xa0
>    [<ffffffff81be89c2>] system_call_fastpath+0x16/0x1b
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Reported-by: Fengguang Wu <fengguang.wu@intel.com>
> Cc: Michal Hocko <mhocko@suse.cz>

Acked-by: Michal Hocko <mhocko@suse.cz>

> ---
> Michal, if it looks okay, can you please route this patch?

Andrew, could you pick this one up, please?

> Thanks.
> 
>  mm/memcontrol.c |   14 +++++++++++++-
>  1 file changed, 13 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 12307b3..7d8a27f 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -4982,7 +4982,6 @@ mem_cgroup_css_alloc(struct cgroup *cont)
>  						&per_cpu(memcg_stock, cpu);
>  			INIT_WORK(&stock->work, drain_local_stock);
>  		}
> -		hotcpu_notifier(memcg_cpu_hotplug_callback, 0);
>  	} else {
>  		parent = mem_cgroup_from_cont(cont->parent);
>  		memcg->use_hierarchy = parent->use_hierarchy;
> @@ -5644,6 +5643,19 @@ struct cgroup_subsys mem_cgroup_subsys = {
>  	.use_id = 1,
>  };
>  
> +/*
> + * The rest of init is performed during ->css_alloc() for root css which
> + * happens before initcalls.  hotcpu_notifier() can't be done together as
> + * it would introduce circular locking by adding cgroup_lock -> cpu hotplug
> + * dependency.  Do it from a subsys_initcall().
> + */
> +static int __init mem_cgroup_init(void)
> +{
> +	hotcpu_notifier(memcg_cpu_hotplug_callback, 0);

Hmm, we can move enable_swap_cgroup() and the per-cpu memcg_stock
initialization here as well to make css_alloc() a bit cleaner.
mem_cgroup_soft_limit_tree_init() with a trivial BUG_ON() on allocation
failure can go there as well.

I will do it.
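
Something like this, roughly (only a sketch of the idea - assuming the
helpers named above keep their current names and the stock init loop can
simply move out of ->css_alloc(); this is not the actual cleanup patch):

	static int __init mem_cgroup_init(void)
	{
		int cpu;

		hotcpu_notifier(memcg_cpu_hotplug_callback, 0);
		enable_swap_cgroup();
		/* would grow the trivial BUG_ON() on allocation failure */
		mem_cgroup_soft_limit_tree_init();

		/* the per-cpu stock init currently done for the root css */
		for_each_possible_cpu(cpu) {
			struct memcg_stock_pcp *stock =
						&per_cpu(memcg_stock, cpu);
			INIT_WORK(&stock->work, drain_local_stock);
		}
		return 0;
	}
	subsys_initcall(mem_cgroup_init);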

> +	return 0;
> +}
> +subsys_initcall(mem_cgroup_init);
> +
>  #ifdef CONFIG_MEMCG_SWAP
>  static int __init enable_swap_account(char *s)
>  {

-- 
Michal Hocko
SUSE Labs


* Re: [PATCH] memcg: don't register hotcpu notifier from ->css_alloc()
  2012-12-18 16:40   ` Michal Hocko
@ 2012-12-18 16:47     ` Tejun Heo
  0 siblings, 0 replies; 3+ messages in thread
From: Tejun Heo @ 2012-12-18 16:47 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Andrew Morton, Johannes Weiner, KAMEZAWA Hiroyuki, Balbir Singh,
	LKML, Fengguang Wu, cgroups, linux-mm

Hey, Michal.

On Tue, Dec 18, 2012 at 05:40:22PM +0100, Michal Hocko wrote:
> > +/*
> > + * The rest of init is performed during ->css_alloc() for root css which
> > + * happens before initcalls.  hotcpu_notifier() can't be done together as
> > + * it would introduce circular locking by adding cgroup_lock -> cpu hotplug
> > + * dependency.  Do it from a subsys_initcall().
> > + */
> > +static int __init mem_cgroup_init(void)
> > +{
> > +	hotcpu_notifier(memcg_cpu_hotplug_callback, 0);
> 
> Hmm, we can move enable_swap_cgroup() and the per-cpu memcg_stock
> initialization here as well to make css_alloc() a bit cleaner.
> mem_cgroup_soft_limit_tree_init() with a trivial BUG_ON() on allocation
> failure can go there as well.
> 
> I will do it.

The thing was that cgroup_init() happens before any initcalls, so
you'll end up with the root css being set up before other stuff gets
initialized, which could be okay but is a bit nasty.  I'm wondering
why cgroup_init() has to happen so early.  The cpu one already takes
an early init path, but the rest doesn't seem to need such early
init.

Anyways, yeap, no objection to cleaning up in whatever way fits
memcg.  We can deal with the whole init order thing later.

Thanks.

-- 
tejun
